A colleague of mine was planning to modify the max_xfer_size attribute on a couple of FC adapters in one of his AIX LPARs. As he described his plan to me, I asked him how he intended to back out of the change should the LPAR fail to boot after the modifications. “But what could possibly go wrong?” he fired back. I advised him to use multibos to create a standby (backup) instance of the AIX OS, just in case. He begrudgingly did so, just to keep me happy.
The next day he told me the following tale.
He had modified the FC adapters' max_xfer_size attribute as planned.
First, he checked the current value of the attribute on both adapters.
aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size
max_xfer_size 0x100000 Maximum Transfer Size True
aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size
max_xfer_size 0x100000 Maximum Transfer Size True
Before making any changes to the adapters, he created a standby AIX instance. He used the -t flag to prevent multibos from changing the bootlist to the standby boot logical volume (BLV).
aixlpar1 : / # multibos -sXt
Initializing multibos methods ...
Initializing log /etc/multibos/logs/op.alog ...
Gathering system information ...
+-----------------------------------------------------------------------------+
Setup Operation
+-----------------------------------------------------------------------------+
Verifying operation parameters ...
Creating image.data file ...
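He modified the FC adapters as planned, using chdev with the -P flag so that the change was written to the ODM only and would take effect at the next restart.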
aixlpar1 : / # chdev -l fcs0 -a max_xfer_size=0x200000 -P
fcs0 changed
aixlpar1 : / # chdev -l fcs1 -a max_xfer_size=0x200000 -P
fcs1 changed
aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size
max_xfer_size 0x200000 Maximum Transfer Size True
aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size
max_xfer_size 0x200000 Maximum Transfer Size True
He verified that the standby instance still
held the original values for both FC adapters.
aixlpar1 : / # multibos -S
Initializing multibos methods ...
Initializing log /etc/multibos/logs/op.alog ...
Gathering system information ...
+-----------------------------------------------------------------------------+
Multibos Shell Operation
+-----------------------------------------------------------------------------+
Verifying operation parameters ...
+-----------------------------------------------------------------------------+
Mount Processing
+-----------------------------------------------------------------------------+
Mounting all standby BOS file systems ...
Mounting /bos_inst
Mounting /bos_inst/usr
Mounting /bos_inst/var
Mounting /bos_inst/opt
+-----------------------------------------------------------------------------+
Multibos Root Shell
+-----------------------------------------------------------------------------+
Starting multibos root shell ...
Active boot logical volume is hd5.
Script command is started. The file is /etc/multibos/logs/scriptlog.120713124518.txt.
aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size
max_xfer_size 0x100000 Maximum Transfer Size True
aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size
max_xfer_size 0x100000 Maximum Transfer Size True
aixlpar1 : / # exit
Script command is complete. The file is /etc/multibos/logs/scriptlog.120713124518.txt.
Stopping multibos root shell ...
Compressing script log file ...
Compressed script log file is /etc/multibos/logs/scriptlog.120713124518.txt.Z
+-----------------------------------------------------------------------------+
Mount Processing
+-----------------------------------------------------------------------------+
Unmounting all standby BOS file systems ...
Unmounting /bos_inst/opt
Unmounting /bos_inst/var
Unmounting /bos_inst/usr
Unmounting /bos_inst
Log file is /etc/multibos/logs/op.alog
Return Status = SUCCESS
Then he manually changed the LPAR's bootlist to include the standby BLV.
aixlpar1 : / # bootlist -m normal hdisk2 blv=hd5 hdisk2 blv=bos_hd5
aixlpar1 : / # bootlist -m normal -o
hdisk2 blv=hd5 pathid=0
hdisk2 blv=hd5 pathid=1
hdisk2 blv=bos_hd5 pathid=0
hdisk2 blv=bos_hd5 pathid=1
He carefully recorded the verbose bootlist output, just in case the boot failed with the new max_xfer_size values.
In an emergency, he could use the vdevice name and location to manually select the standby BLV and start the system from it.
aixlpar1 : / # bootlist -m normal -ov
'ibm,max-boot-devices' = 0x5
NVRAM variable: (boot-device=/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:2 /vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:2 /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4 /vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:4)
Path name: (/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:2)
match_specific_info: ut=disk/fcp/htcvspmpio
hdisk2 blv=hd5 pathid=0
Path name: (/vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:2)
match_specific_info: ut=disk/fcp/htcvspmpio
hdisk2 blv=hd5 pathid=1
Path name: (/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4)
match_specific_info: ut=disk/fcp/htcvspmpio
hdisk2 blv=bos_hd5 pathid=0
Path name: (/vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:4)
match_specific_info: ut=disk/fcp/htcvspmpio
hdisk2 blv=bos_hd5 pathid=1
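Rather than copying the device path by hand, the path recorded for a given BLV can be pulled out of the verbose bootlist output. What follows is only a minimal sketch, not something my colleague ran: blv_boot_paths is a hypothetical helper name, and it assumes the output keeps the "Path name: (...)" and "blv=<name>" layout shown in the listing above.

```shell
# Hypothetical helper: print the Open Firmware device path(s) recorded for one BLV.
# Reads `bootlist -m normal -ov` style output on stdin; $1 is the BLV name.
blv_boot_paths() {
    awk -v want="blv=$1" '
        # Remember the most recent "Path name: (...)" entry, without the parentheses.
        /name: \(/ {
            path = $0
            sub(/^.*name: \(/, "", path)
            sub(/\).*$/, "", path)
        }
        # When a line carries the requested blv=<name> field, print the remembered path.
        {
            for (i = 1; i <= NF; i++)
                if ($i == want) { print path; break }
        }
    '
}

# Assumed usage on the LPAR (not run here):
#   bootlist -m normal -ov | blv_boot_paths bos_hd5
```

The field comparison is exact, so asking for hd5 does not also match bos_hd5. Each path printed is one that could be typed after "boot" at the Open Firmware prompt.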
He restarted the LPAR using the primary
BLV, with the modified FC attributes.
The system hung at LED 554.
hscroot@hmc1:~> lsrefcode -m 795-1 -r lpar --filter "lpar_names=aixlpar1" -F lpar_name:refcode
aixlpar1:0554
He restarted the LPAR in Open Firmware
mode.
At the Open Firmware prompt, he entered the
following to boot the LPAR from the standby instance BLV:
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
1 = SMS Menu                          5 = Default Boot List
8 = Open Firmware Prompt              6 = Stored Boot List
Memory      Keyboard     Network     SCSI     Speaker  ok
0 > boot /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4
The system booted OK on the standby AIX
instance.
Elapsed time since release of system processors: 28858 mins 55 secs
-------------------------------------------------------------------------------
Welcome to AIX.
boot image timestamp: 02:39:13 07/13/2012
The current time and date: 02:54:02 07/13/2012
processor count: 1; memory size: 4096MB; kernel size: 35062697
boot device: /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4
-------------------------------------------------------------------------------
The FC adapters were running on the previous
values, stored in the standby instance of AIX.
aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size
max_xfer_size 0x100000 Maximum Transfer Size True
aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size
max_xfer_size 0x100000 Maximum Transfer Size True
The original AIX instance still held the modified values for the FC adapters.
aixlpar1 : / # multibos -S
Initializing multibos methods ...
Initializing log /etc/multibos/logs/op.alog ...
Gathering system information ...
+-----------------------------------------------------------------------------+
Multibos Shell Operation
+-----------------------------------------------------------------------------+
Verifying operation parameters ...
+-----------------------------------------------------------------------------+
Mount Processing
+-----------------------------------------------------------------------------+
Mounting all standby BOS file systems ...
Mounting /bos_inst
Mounting /bos_inst/usr
Mounting /bos_inst/var
Mounting /bos_inst/opt
+-----------------------------------------------------------------------------+
Multibos Root Shell
+-----------------------------------------------------------------------------+
Starting multibos root shell ...
Active boot logical volume is bos_hd5.
Script command is started. The file is /etc/multibos/logs/scriptlog.120713125542.txt.
aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size
max_xfer_size 0x200000 Maximum Transfer Size True
aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size
max_xfer_size 0x200000 Maximum Transfer Size True
aixlpar1 : / # exit
Script command is complete. The file is /etc/multibos/logs/scriptlog.120713125542.txt.
Stopping multibos root shell ...
Compressing script log file ...
Compressed script log file is /etc/multibos/logs/scriptlog.120713125542.txt.Z
+-----------------------------------------------------------------------------+
Mount Processing
+-----------------------------------------------------------------------------+
Unmounting all standby BOS file systems ...
Unmounting /bos_inst/opt
Unmounting /bos_inst/var
Unmounting /bos_inst/usr
Unmounting /bos_inst
Log file is /etc/multibos/logs/op.alog
Return Status = SUCCESS
The cause of the 554 hang appeared to be that the physical FC adapters on the VIOS needed their max_xfer_size value changed to the new value before the client LPAR's virtual Fibre Channel adapters were modified.
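A quick comparison of the two values before touching the client might have flagged the problem. The sketch below is a hedged illustration only: xfer_size_fits is a hypothetical helper name, and the rule it encodes (the client value should not exceed the value on the VIOS physical adapter) is my reading of the apparent cause described above, not a confirmed diagnosis.

```shell
# Hypothetical helper: succeed when the client max_xfer_size does not exceed
# the value set on the VIOS physical adapter.
# Both arguments are hex strings as printed by lsattr, for example 0x200000.
xfer_size_fits() {
    # Shell arithmetic expansion converts the hex strings to integers.
    [ "$(( $1 ))" -le "$(( $2 ))" ]
}

# Assumed usage (not run here); the VIOS value is copied across by hand and the
# adapter name fcs0 is a placeholder:
#   client=$(lsattr -El fcs0 -a max_xfer_size -F value)
#   xfer_size_fits "$client" 0x100000 || echo "raise max_xfer_size on the VIOS first"
```

With the values from this story, a client value of 0x200000 against a VIOS value of 0x100000 fails the check, while 0x100000 against 0x100000 passes.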
My colleague was glad he used multibos. It saved his bacon.
Tags: bacon, multibos, chris_gibson, aix, max_xfer_size