A colleague of mine was planning to modify the max_xfer_size attribute on a couple of FC adapters in one of his AIX LPARs. As he was describing his plan to me, I asked him how he intended to back out of the change should the LPAR fail to boot after the modifications. "But what could possibly go wrong?" he fired back. I advised him to use multibos to create a standby (backup) instance of the AIX OS, just in case. He begrudgingly did so, just to keep me happy.

The next day he told me the following tale.

He had modified the FC adapters' max_xfer_size attribute as planned. First, he checked the current value of the attribute on both adapters.

aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size

max_xfer_size 0x100000 Maximum Transfer Size True

aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size

max_xfer_size 0x100000 Maximum Transfer Size True
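The attribute is reported in hexadecimal, so it is worth converting the values to see what the change actually means. A minimal sketch using shell arithmetic; xfer_size_human is a hypothetical helper name, not an AIX command:

```shell
# xfer_size_human: hypothetical helper (not part of AIX).
# Converts a hexadecimal max_xfer_size value, as printed by lsattr,
# into bytes and MiB using shell arithmetic.
xfer_size_human() {
    bytes=$(( $1 ))
    echo "$1 = $bytes bytes = $(( bytes / 1048576 )) MiB"
}

xfer_size_human 0x100000   # prints: 0x100000 = 1048576 bytes = 1 MiB
xfer_size_human 0x200000   # prints: 0x200000 = 2097152 bytes = 2 MiB
```

In other words, the plan was to double the maximum transfer size from 1 MiB to 2 MiB.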

He'd created a standby AIX instance before making any changes to the adapters. He also used the -t flag to prevent multibos from changing the bootlist to the standby boot logical volume (BLV).

aixlpar1 : / # multibos -sXt

Initializing multibos methods ...

Initializing log /etc/multibos/logs/op.alog ...

Gathering system information ...

+-----------------------------------------------------------------------------+

Setup Operation

+-----------------------------------------------------------------------------+

Verifying operation parameters ...

Creating image.data file ...

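He modified the FC adapters as planned, using the -P flag so that the change was recorded in the ODM and would take effect at the next restart.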

aixlpar1 : / # chdev -l fcs0 -a max_xfer_size=0x200000 -P

fcs0 changed

aixlpar1 : / # chdev -l fcs1 -a max_xfer_size=0x200000 -P

fcs1 changed

aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size

max_xfer_size 0x200000 Maximum Transfer Size True

aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size

max_xfer_size 0x200000 Maximum Transfer Size True
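A deferred change like this can be reversed with the same chdev command pointing at the old value, provided the original lsattr output was saved. The following is only a sketch: backout_cmd is a hypothetical helper that turns one line of saved lsattr output into the matching back-out command. It prints the command rather than running it.

```shell
# backout_cmd: hypothetical helper (not part of AIX).
# Reads one line of "lsattr -El <dev> -a <attr>" output on stdin and
# prints the chdev command that would restore that value at the next
# boot (-P). It only prints the command; it does not run it.
backout_cmd() {
    dev=$1
    awk -v dev="$dev" 'NF >= 2 { printf "chdev -l %s -a %s=%s -P\n", dev, $1, $2 }'
}

# Example using the original value recorded for fcs0:
echo "max_xfer_size 0x100000 Maximum Transfer Size True" | backout_cmd fcs0
# prints: chdev -l fcs0 -a max_xfer_size=0x100000 -P
```

This only helps if the system can still boot far enough to run chdev, which is exactly why the standby instance matters.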

He verified that the standby instance still held the original values for both FC adapters.

aixlpar1 : / # multibos -S

Initializing multibos methods ...

Initializing log /etc/multibos/logs/op.alog ...

Gathering system information ...

+-----------------------------------------------------------------------------+

Multibos Shell Operation

+-----------------------------------------------------------------------------+

Verifying operation parameters ...

+-----------------------------------------------------------------------------+

Mount Processing

+-----------------------------------------------------------------------------+

Mounting all standby BOS file systems ...

Mounting /bos_inst

Mounting /bos_inst/usr

Mounting /bos_inst/var

Mounting /bos_inst/opt

+-----------------------------------------------------------------------------+

Multibos Root Shell

+-----------------------------------------------------------------------------+

Starting multibos root shell ...

Active boot logical volume is hd5.

Script command is started. The file is /etc/multibos/logs/scriptlog.120713124518.txt.

aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size

max_xfer_size 0x100000 Maximum Transfer Size True

aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size

max_xfer_size 0x100000 Maximum Transfer Size True

aixlpar1 : / # exit

Script command is complete. The file is /etc/multibos/logs/scriptlog.120713124518.txt.

Stopping multibos root shell ...

Compressing script log file ...

Compressed script log file is /etc/multibos/logs/scriptlog.120713124518.txt.Z

+-----------------------------------------------------------------------------+

Mount Processing

+-----------------------------------------------------------------------------+

Unmounting all standby BOS file systems ...

Unmounting /bos_inst/opt

Unmounting /bos_inst/var

Unmounting /bos_inst/usr

Unmounting /bos_inst

Log file is /etc/multibos/logs/op.alog

Return Status = SUCCESS

Then he manually changed the LPAR's boot list to include the standby BLV.

aixlpar1 : / # bootlist -m normal hdisk2 blv=hd5 hdisk2 blv=bos_hd5

aixlpar1 : / # bootlist -m normal -o

hdisk2 blv=hd5 pathid=0

hdisk2 blv=hd5 pathid=1

hdisk2 blv=bos_hd5 pathid=0

hdisk2 blv=bos_hd5 pathid=1

He carefully recorded the verbose bootlist output, in case the boot failed with the new max_xfer_size values. In an emergency, he could use the Open Firmware device path (the /vdevice path name) of the standby BLV to manually start the system from it.

aixlpar1 : / # bootlist -m normal -ov

'ibm,max-boot-devices' = 0x5

NVRAM variable: (boot-device=/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:2 /vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:2 /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4 /vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:4)

Path name: (/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:2)

match_specific_info: ut=disk/fcp/htcvspmpio

hdisk2 blv=hd5 pathid=0

Path name: (/vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:2)

match_specific_info: ut=disk/fcp/htcvspmpio

hdisk2 blv=hd5 pathid=1

Path name: (/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4)

match_specific_info: ut=disk/fcp/htcvspmpio

hdisk2 blv=bos_hd5 pathid=0

Path name: (/vdevice/vfc-client@3000001e/disk@50060e8006d0207a,1000000000000:4)

match_specific_info: ut=disk/fcp/htcvspmpio

hdisk2 blv=bos_hd5 pathid=1
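If the verbose output is saved to a file, the path needed at the Open Firmware prompt can be pulled out without reading through it by eye. A rough sketch, assuming the output layout shown above; of_boot_path is a hypothetical helper, not an AIX command:

```shell
# of_boot_path: hypothetical helper (not part of AIX).
# Reads saved "bootlist -m normal -ov" output on stdin and prints the
# Open Firmware device path recorded for each entry of the given BLV.
of_boot_path() {
    awk -v blv="blv=$1" '
        /^Path name:/ {
            path = $0
            sub(/^Path name: *\(/, "", path)   # drop the label and "("
            sub(/\) *$/, "", path)             # drop the trailing ")"
        }
        {
            # An entry line such as "hdisk2 blv=bos_hd5 pathid=0"
            # follows the "Path name:" line it belongs to.
            for (i = 1; i <= NF; i++)
                if ($i == blv) print path
        }
    '
}

# Example with two of the entries recorded above:
of_boot_path bos_hd5 <<'EOF'
Path name: (/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:2)
match_specific_info: ut=disk/fcp/htcvspmpio
hdisk2 blv=hd5 pathid=0
Path name: (/vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4)
match_specific_info: ut=disk/fcp/htcvspmpio
hdisk2 blv=bos_hd5 pathid=0
EOF
# prints: /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4
```

The field comparison is exact, so asking for hd5 does not also match bos_hd5.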

He restarted the LPAR using the primary BLV, with the modified FC attributes.

The system hung at LED 554.

hscroot@hmc1:~> lsrefcode -m 795-1 -r lpar --filter "lpar_names=aixlpar1" -F lpar_name:refcode

aixlpar1:0554

He restarted the LPAR in Open Firmware mode.

At the Open Firmware prompt, he entered the following to boot the LPAR from the standby instance BLV:

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

1 = SMS Menu 5 = Default Boot List

8 = Open Firmware Prompt 6 = Stored Boot List

Memory Keyboard Network SCSI Speaker ok

0 > boot /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4

The system booted OK on the standby AIX instance.

Elapsed time since release of system processors: 28858 mins 55 secs

-------------------------------------------------------------------------------

Welcome to AIX.

boot image timestamp: 02:39:13 07/13/2012

The current time and date: 02:54:02 07/13/2012

processor count: 1; memory size: 4096MB; kernel size: 35062697

boot device: /vdevice/vfc-client@30000014/disk@50060e8006d0206a,1000000000000:4

-------------------------------------------------------------------------------

The FC adapters were running on the previous values, stored in the standby instance of AIX.

aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size

max_xfer_size 0x100000 Maximum Transfer Size True

aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size

max_xfer_size 0x100000 Maximum Transfer Size True

The original AIX instance still held the modified values for the FC adapters.

aixlpar1 : / # multibos -S

Initializing multibos methods ...

Initializing log /etc/multibos/logs/op.alog ...

Gathering system information ...

+-----------------------------------------------------------------------------+

Multibos Shell Operation

+-----------------------------------------------------------------------------+

Verifying operation parameters ...

+-----------------------------------------------------------------------------+

Mount Processing

+-----------------------------------------------------------------------------+

Mounting all standby BOS file systems ...

Mounting /bos_inst

Mounting /bos_inst/usr

Mounting /bos_inst/var

Mounting /bos_inst/opt

+-----------------------------------------------------------------------------+

Multibos Root Shell

+-----------------------------------------------------------------------------+

Starting multibos root shell ...

Active boot logical volume is bos_hd5.

Script command is started. The file is /etc/multibos/logs/scriptlog.120713125542.txt.

aixlpar1 : / # lsattr -El fcs0 -a max_xfer_size

max_xfer_size 0x200000 Maximum Transfer Size True

aixlpar1 : / # lsattr -El fcs1 -a max_xfer_size

max_xfer_size 0x200000 Maximum Transfer Size True

aixlpar1 : / # exit

Script command is complete. The file is /etc/multibos/logs/scriptlog.120713125542.txt.

Stopping multibos root shell ...

Compressing script log file ...

Compressed script log file is /etc/multibos/logs/scriptlog.120713125542.txt.Z

+-----------------------------------------------------------------------------+

Mount Processing

+-----------------------------------------------------------------------------+

Unmounting all standby BOS file systems ...

Unmounting /bos_inst/opt

Unmounting /bos_inst/var

Unmounting /bos_inst/usr

Unmounting /bos_inst

Log file is /etc/multibos/logs/op.alog

Return Status = SUCCESS

The 554 hang appeared to be caused by the order of the changes: the physical FC adapters on the VIOS needed their max_xfer_size set to the new value before the client LPAR's virtual Fibre Channel adapters were modified.
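That ordering can be checked before touching the client. The sketch below rests on an assumption drawn from this experience, namely that the client value should not exceed the value set on the VIOS physical adapter. xfer_order_ok is a hypothetical helper, and the two values are passed in by hand rather than queried from the VIOS.

```shell
# xfer_order_ok: hypothetical helper (not an AIX or VIOS command).
# $1 = max_xfer_size currently set on the VIOS physical adapter
# $2 = max_xfer_size planned for the client virtual FC adapter
# Assumption: the client value must not exceed the VIOS value.
xfer_order_ok() {
    vios=$(( $1 ))
    client=$(( $2 ))
    if [ "$client" -le "$vios" ]; then
        echo "OK: client $2 does not exceed VIOS $1"
    else
        echo "STOP: raise the VIOS adapter to $2 first (currently $1)"
        return 1
    fi
}

xfer_order_ok 0x100000 0x200000 || true   # the situation in this story
xfer_order_ok 0x200000 0x200000           # after the VIOS is changed first
```

The first call reports STOP and the second reports OK, which matches what happened here.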

My colleague was glad he used multibos. It saved his bacon.