Last year I was working for a customer that was upgrading several AIX 5.3 systems to AIX 6.1. The migrations were successful for the most part, but we did encounter one issue that took a little time to resolve.
The customer was using nimadm to migrate. This process worked fine; however, on a couple of systems a strange error was encountered after the migration. The LPAR was booted into AIX 6.1 and everything came up fine. The applications were started and users started accessing the system.
It was several days later, when the AIX administrator attempted to configure new storage on the AIX system, that the first sign of trouble appeared. He had asked his storage guy to assign a couple of new disks to his LPAR (via NPIV/VFC). As soon as the storage admin had completed the assignment, the AIX admin ran cfgmgr to detect and configure the new hdisks. Immediately, cfgmgr reported the following error:
Method error (/usr/lib/methods/cfgscsidisk):
0514-023 The specified device does not exist in the
customized device configuration database.
Initially, the AIX team suspected there was some fault with either the storage device or the zoning of the disk. Both of these items were checked and double-checked and were found to be OK. Our next step was to run cfgmgr again, but this time we wanted a greater level of detail captured. To do this we used the following environment variable to force cfgmgr to be ‘more verbose’.
# export CFGLOG="cmd,meth,lib,verbosity:9"
We ran cfgmgr and went to the /var/adm/ras/cfglog file to view the results with the alog command. However, we noticed that the cfglog file had a size of zero (0) and contained no data.
# cd /var/adm/ras
# ls -l cfglog
-rw-r----- 1 root system 0 May 16 13:22 cfglog
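The empty-log check above can be scripted so it is not missed the next time verbose logging is switched on. This is only a minimal sketch: the function name cfglog_is_empty is made up for illustration, and the default path is simply the one used in this post.

```shell
# Hypothetical helper (not an AIX command): succeeds (exit 0) when the
# config log is missing or has a size of zero bytes.
cfglog_is_empty() {
    log="${1:-/var/adm/ras/cfglog}"   # default path as used in this post
    [ ! -s "$log" ]                   # -s is true only for a non-empty file
}

# Possible use on the AIX host before trusting the verbose output:
# if cfglog_is_empty; then
#     echo "cfglog is empty; recreate it before re-running cfgmgr" >&2
# fi
```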
We decided to recreate the cfglog alog file and then run mkdev against one of the disks to reproduce the disk configuration error.
# rm cfglog
# echo "Create cfglog `date`"|alog -t cfg
# mkdev -l hdisk0
Method error (/usr/lib/methods/cfgscsidisk):
0514-023 The specified device does not exist in the
customized device configuration database.
This time we found some useful data in the cfglog file.
# alog -t cfg -o
MS 31981804 28835876 /usr/lib/methods/cfgscsidisk -l hdisk39
M4 31981804 Parallel mode = 0
M4 31981804 Get CuDv for hdisk39
M4 31981804 Get device PdDv, uniquetype=disk/fcp/htcvspmpio
M4 31981804 Get parent CuDv, name=fscsi0
M4 31981804 ..is_mpio_capable()
M4 31981804 Device is MPIO
M4 31981804 ..get_paths()
M4 31981804 Getting CuPaths for name='hdisk39'
M4 31981804 Found 1 paths
M0 31981804 cfgcommon.c 225 mpio_init error, rc=23
MS 28835892 31981568 /usr/lib/methods/cfgscsidisk -l hdisk0
M4 28835892 Parallel mode = 0
M4 28835892 Get CuDv for hdisk0
M4 28835892 Get device PdDv, uniquetype=disk/fcp/htcvspmpio
M4 28835892 Get parent CuDv, name=fscsi0
M4 28835892 ..is_mpio_capable()
M4 28835892 Device is MPIO
M4 28835892 ..get_paths()
M4 28835892 Getting CuPaths for name='hdisk0'
M4 28835892 Found 2 paths
M0 28835892 cfgcommon.c 225 mpio_init error, rc=23
MS 25690326 27328608 /usr/lib/methods/cfgscsidisk -l hdisk0
M4 25690326 Parallel mode = 0
M4 25690326 Get CuDv for hdisk0
M4 25690326 Get device PdDv, uniquetype=disk/fcp/htcvspmpio
M4 25690326 Get parent CuDv, name=fscsi0
M4 25690326 ..is_mpio_capable()
M4 25690326 Device is MPIO
M4 25690326 ..get_paths()
M4 25690326 Getting CuPaths for name='hdisk0'
M4 25690326 Found 2 paths
M0 25690326 cfgcommon.c 225 mpio_init error, rc=23
The configuration method was attempting to configure a disk device type of htcvspmpio (which was correct) but it was unable to configure the device paths (mpio_init error, rc=23). We suspected that the system was missing some sort of device driver support for the type of storage in use.
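With verbosity set to 9 the log gets long, and the error-level lines are easy to lose among the M4 entries. The following is a rough sketch of a filter that prints only the M0 lines, each prefixed with the method command from its MS line. The function name cfglog_errors is hypothetical, and it assumes (based only on the output shown above) that the second field ties each M0 line to the MS line that started the method.

```shell
# Hypothetical filter: reads `alog -t cfg -o` output on stdin.
# Assumption: field 2 is an identifier shared by an MS line and the
# M0/M4 lines that belong to the same method invocation.
cfglog_errors() {
    awk '
        $1 == "MS" {
            # Fields 4..NF hold the method and its arguments.
            cmd[$2] = $4
            for (i = 5; i <= NF; i++) cmd[$2] = cmd[$2] " " $i
        }
        $1 == "M0" {
            # Fields 3..NF hold the error message text.
            msg = $3
            for (i = 4; i <= NF; i++) msg = msg " " $i
            who = ($2 in cmd) ? cmd[$2] : "unknown"
            print who ": " msg
        }
    '
}
```

On the AIX host it would be used as: alog -t cfg -o | cfglog_errors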
Cutting a very long story short, we determined, with the help of the IBM AIX support team, that the issue stemmed from “old” AIX installation media used to create the AIX 6.1 TL6 SP5 SPOT and lppsource on the NIM master. Old AIX 6.1 media was originally used (several years ago) to create the NIM resources and was gradually updated over time, all the way up to TL6 SP5.
IBM support identified that the older install media contained a liblpp.a file that was missing the necessary PdPathAt ODM files. Newer install media contained a fix to add the appropriate entries to the bos.rte.cfgfiles file. For example:
SPOT created using OLD install media.
======================================
# ar xv /export/spot/spotaix610605_OLD/usr/lpp/bos/liblpp.a bos.rte.cfgfiles
x - bos.rte.cfgfiles
# grep PdPathAt bos.rte.cfgfiles
#
SPOT created using NEW install media.
======================================
# ar xv /export/spot/spotaix610605_NEW/usr/lpp/bos/liblpp.a bos.rte.cfgfiles
x - bos.rte.cfgfiles
# grep PdPathAt bos.rte.cfgfiles
/usr/lib/objrepos/PdPathAt v4preserve
/usr/lib/objrepos/PdPathAt.vc v4preserve
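The same comparison can be wrapped in a small check to run against any SPOT before using it for a migration. This is just a sketch: has_pdpathat and check_spot are hypothetical names, and the SPOT directory is whatever path your NIM master uses. It relies on `ar p`, which prints an archive member to standard output rather than extracting it into the current directory.

```shell
# Hypothetical helper: reads the contents of bos.rte.cfgfiles on stdin and
# succeeds (exit 0) if any line mentions PdPathAt.
has_pdpathat() {
    grep -q 'PdPathAt'
}

# Hypothetical wrapper: $1 is a SPOT directory, e.g. one of those shown above.
# Note: if the archive or member cannot be read, this also reports MISSING.
check_spot() {
    if ar p "$1/usr/lpp/bos/liblpp.a" bos.rte.cfgfiles | has_pdpathat; then
        echo "$1: PdPathAt entries present"
    else
        echo "$1: PdPathAt entries MISSING"
        return 1
    fi
}
```

For example: check_spot /export/spot/spotaix610605_NEW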
We recreated the SPOT and lppsource resources using newer media on the NIM master. We were then able to migrate the AIX 5.3 LPAR to 6.1 without encountering the issues faced previously.