nimadm - multibos error!?
I’ve been working with a customer recently on an issue with nimadm. They were attempting to migrate a system from AIX 5.3 to 7.1 using nimadm. The NIM client AIX level was 5.3 TL12 SP4 and the NIM master was running AIX 7.1 TL1 SP1.
lpar1 : / # oslevel -s
5300-12-04-1119
root@nim1 : / # oslevel -s
7100-01-01-1141
They were running the following nimadm command:
# nimadm -j nimadmvg -c lpar1 -s spotaix710101 -l lpp_sourceaix710101 -d hdisk1 -Y
The nimadm operation would always fail at phase 11:
+---
We also found the init_multibos
error in the /var
Tue Nov 22 14:48:12 EETDT 2011 cmd:
/ALT Verifying altinst_rootvg... alt_disk_copy: 0505-218 ATTENTION: init_multibos() returned an unexpected result. Cleaning up.
Given that the error appeared to be related to init_multibos, we assumed the failure was due to some multibos checks being performed by alt_disk_copy on the client. The client system did not have an existing multibos standby instance. So we tried two things. First, we created a standby instance on the client (multibos -s -X) and retried the nimadm operation. This failed. Next, we removed the standby instance (multibos -R) and retried the nimadm operation. This worked, and the client then migrated to AIX 7.1 successfully. We repeated the same sequence (i.e. create standby instance, remove standby instance, then nimadm) several times and it worked as expected each time.
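The create-then-remove sequence described above could be scripted on the client roughly as follows. This is only a sketch: the `run` helper, the `multibos_workaround` function name and the `DRY_RUN` variable are made up for illustration, and the only real commands are the two multibos invocations already mentioned. By default it just prints what it would do.

```shell
# run CMD [ARGS...]: print the command when DRY_RUN=1 (the default),
# otherwise execute it. Hypothetical helper, not part of AIX.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# multibos_workaround: create a standby BOS instance, then remove it.
# Stops at the first step that fails.
multibos_workaround() {
    run multibos -s -X || return 1
    run multibos -R || return 1
}

# Usage on the NIM client, before starting nimadm from the master:
#   multibos_workaround              # dry run, prints the two commands
#   DRY_RUN=0 multibos_workaround    # actually run them
```

Keeping the dry run as the default means the sequence can be reviewed on each LPAR before anything is changed.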
So it appeared that the unofficial workaround for this problem would be to create and then remove a standby multibos instance prior to the nimadm migration. However, the customer has over 200 LPARs that they need to migrate to AIX 7.1. If possible, they would much rather avoid this extra step in their AIX 7.1 migration plan. We’ve made contact with IBM support and are hoping they can help us identify the root cause of the issue and provide an official solution.
And just yesterday we hit the same problem when migrating from AIX 6.1 to 7.1 using nimadm. I’ll update my blog with any progress we make on this problem. In the meantime, our unofficial workaround will get us “out of hot water”!
UPDATE (14/12/2011): The simple fix is to remove the /bos_inst directory before attempting the AIX migration. i.e.
# rm -r /bos_inst
In my nimadm article I wrote:
Confirm the legacy LV names are now in use, that is, not bos_.
Remove the old multibos instance.
Unfortunately, it appears that ‘multibos -R’ may not clean up the /bos_inst directory. If this directory exists, the nimadm operation will most likely fail.
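With a large number of LPARs to migrate, it may be worth checking for a leftover /bos_inst on each client before starting nimadm. A minimal sketch of such a pre-flight check follows. The function name `check_bos_inst` and its messages are made up for illustration, and it only tests whether the directory exists; it does not check whether anything is still mounted underneath it.

```shell
# check_bos_inst [DIR]: return 0 when DIR (default /bos_inst) is absent,
# 1 when it exists. Hypothetical helper for illustration only.
check_bos_inst() {
    dir="${1:-/bos_inst}"
    if [ -d "$dir" ]; then
        echo "WARNING: $dir exists; remove it before running nimadm" >&2
        return 1
    fi
    echo "OK: $dir not present"
    return 0
}

# Usage on the NIM client:
#   check_bos_inst || echo "clean up /bos_inst first"
```

The check deliberately does not delete anything; removing the directory is left as a manual step so that a live standby instance is never removed by accident.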
Chris, bos.alt_disk_install.rte on the AIX 5.3 system will do no good. Hopefully no harm either. nimadm runs most of the commands from the master's SPOT, except perhaps the alt_rootvg creation, but I am not sure about that either.
Brian, I bet I know what has hit you, as I suffered from it 18 months ago when I had to upgrade AIX 5.3 to 6.1 for a customer. The short answer is: "nimadm -j". The long answer is the 32-bit legacy code used by NIM to move the files from the NIM client to the master's cache VG. If a rootvg filesystem contains a file larger than a 32-bit signed integer can accommodate (2GB-1B), the copy to the master is silently abandoned. If the file is buried deep in the FS tree, one will see the files/subdirs before it copied, while the remaining files/dirs in that directory and in all its parents are not copied. The workaround is to use `find <FS> -xdev` to identify all files larger than 2GB and to move them to a non-root VG. In your case you have a large file in /opt. Move/archive/delete it and re-run nimadm.
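The `find <FS> -xdev` search mentioned above could look something like the sketch below. The function name `find_large_files` is made up, and the `-size +Nc` (size in bytes) form is assumed to be supported by the find on the client, so check the man page on your AIX level first. The limit defaults to 2147483647 bytes (2GB-1B) and can be overridden.

```shell
# find_large_files FS [LIMIT_BYTES]: list regular files under FS that are
# larger than LIMIT_BYTES (default 2147483647, i.e. 2GB-1B) without
# crossing into other file systems. Hypothetical helper for illustration.
find_large_files() {
    fs="$1"
    limit="${2:-2147483647}"
    find "$fs" -xdev -type f -size +"${limit}c" -print
}

# Usage on the NIM client, once per rootvg file system:
#   find_large_files /opt
#   find_large_files /var
```

Because of -xdev, each rootvg file system has to be passed in separately; anything the function prints is a candidate to move, archive or delete before re-running nimadm.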
Hi AndreKoonings, Sure! Take a good look at Versioned WPARs. https://www.ibm.com/developerworks/community/blogs/aixpert/entry/aix_5_3_within_a_versioned_wpar_just_give_it_ago_it_s_easy1?lang=en Cheers. Chris
Hi Chris, Try something different: have you tried moving the AIX 5.3 partition to a WPAR? You can use a mksysb to recreate the system on a 7.1 LPAR.
Hi omahony, Try installing the bos.alt_disk_install.rte for AIX 7.1 on the AIX 5.3 system. Then try the nimadm operation again. Cheers. Chris
Chris, apologies for necro'ing a thread, but I have something similar and was hoping you might have an idea. I am migrating AIX 5.3 TL12 SP7 to AIX 7.1 TL2 SP3. I created an lpp_source and SPOT from the 7100-01-04 DVDs and upgraded them to 7100-02-03. The command I am using:
nimadm -j devnim7_vg -c devdb3 -s spot_7100-02-03_full -l lpp_source_7100-02-03_full -d "hdisk2 hdisk3" -YV
I get no errors from it, and the migration seems fine. I am keeping a mirrored pair of both rootvgs. However, when I reboot into the new version, almost everything in /opt is missing, including the freeware directory, which has my ssh configs etc. I kicked this off a second time and watched. During the cachevg copy, the /opt folder didn't contain any of the folders. However, it had almost 2GB free, so it wasn't a space issue. This is then visible later in the migration logs, when it calls things from freeware or rpm but just skips them and continues. It's pretty awkward, as getting downtime on this is quite hard, and I can't wake up the 7.1 alt_disk from AIX 5.3. So two questions: 1. Any help on what may be the cause? 2. Is there any way to get nimadm to pause or stop after phase 3 so I don't have to go through the whole process?
One thing I will say, however, is that perhaps a more meaningful error message, such as "/bos_inst exists: please remove this directory and re-try the nimadm operation", would be somewhat more useful than "alt_disk_copy: 0505-218 ATTENTION: init_multibos() returned an unexpected result." :) Cheers.
Thanks Anthony. Some very interesting tests. The same issue was found on several AIX 6.1 systems. I'm not sure if it's an issue with multibos or if there's an issue with the way the customer has used multibos in the past. I don't have enough info to make a call either way.
Chris, the last line of your update from 14 Dec 2011 (about 'multibos -R' not cleaning up /bos_inst) made me curious, so I ran a test. My starting point was an AIX 7.1 system. The multibos cleanup operation looks to be fixed on AIX 7.1. I created a multibos instance on an AIX 7.1 logical partition (7100-01-00-0000) and logged into it successfully (multibos -s). The /bos_inst directory had several files and directories underneath it, as you'd expect. After logging out of the standby instance, the /bos_inst directory was there, but only contained the mount points for the file systems that exist underneath /bos_inst:
/bos_inst # du
0 ./opt
0 ./usr
0 ./var
0 .
Once again, no surprises. I then removed the multibos instance (multibos -R) and it removed ALL standby BOS file systems, including /bos_inst:
Removing all standby BOS file systems ...
Removing standby BOS file system /bos_inst/opt
Removing standby BOS file system /bos_inst/var
Removing standby BOS file system /bos_inst/usr
Removing standby BOS file system /bos_inst
I then tried to change directory to /bos_inst and it was (thankfully) not there:
cd /bos_inst
ksh: /bos_inst: not found.
This is on AIX 7.1, and since nimadm is only for going to a new AIX release, we'll have to wait for AIX 8 to come out to take advantage of this when migrating from AIX 7.1. Still, it looks to be solved. Next test: since AIX 6.1 TL 6 has many features backported from AIX 7.1, does multibos -R remove /bos_inst on AIX 6.1?