Replicated rootvg, ghostdev and NPIV

Update, May 14th, 2019: It's been 7 years since I first published this post. Power Systems and PowerVM technology has progressed significantly since then, hence the need to update this blog post. So, if you're planning on using, or are already using, Simplified Remote Restart (SRR) and/or offline LPM, please do NOT enable ghostdev on your systems. If ghostdev is set to 1 and you attempt to use SRR (or offline LPM), the AIX LPAR will reset the ODM during boot. This is (most likely) not the desired behaviour. If the ODM is cleared, the system will need to be reconfigured so that TCP/IP and LVM are operational again. If you require "ghostdev like" behaviour for your AIX disaster recovery (DR) process, I would recommend you set the sys0 attribute, clouddev, to 1, immediately after you have booted from your replicated rootvg. Rebooting your AIX system with this setting enabled will "Recreate ODM devices on next boot" and allow you to reconfigure your LPAR for DR. Once you've booted with clouddev=1 and reconfigured your AIX LPAR at DR, immediately disable clouddev (i.e. set it to 0, the default), so that the ODM is not cleared again on the next system reboot. Some more details on clouddev below.
"Details on the "ghostdev" version-2 function (aka clouddev) we need from AIX: The desired behavior we want to see occur in AIX to manage resetting device-id naming in the OS is:
1) A user (or a script run when installing 'cloud-init') runs: /usr/sbin/chdev -l sys0 -a clouddev=1
2) AIX then defines an "ibm,…" NVRAM variable.
3) During AIX startup of the ODM, if this clouddev flag is set, AIX looks for the "ibm,…" NVRAM variable; if it does not exist, it sets it to true [and the ODM device entries are rebuilt].
4) If the "ibm,…" NVRAM variable already exists, [the ODM is left untouched on that boot].
This solution leverages the fact that NVRAM data is preserved across Live Partition Migration but not across the image/instance IaaS lifecycle. It also means we are no longer dependent upon detecting different LPAR ids, i.e. we don't break if the parent image is deployed to the same lpar-id on the same host. It also means we no longer have to toggle the ghostdev setting at boot or at AE (Activation Engine) reset (the goal being no need to do any sort of reset, and to be able to do a capture at any time)."
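For illustration, here is a minimal sketch of the clouddev approach described in the update above. The commands are standard AIX commands; the reconfiguration steps in between are only indicative of what your DR runbook would do.

# At the DR site, immediately after the LPAR has booted from the replicated rootvg:
chdev -l sys0 -a clouddev=1      # "Recreate ODM devices on next boot"
shutdown -Fr                     # reboot so the ODM devices are recreated

# After the reboot, reconfigure the LPAR for DR (TCP/IP, import volume groups, etc.),
# then immediately disable clouddev so the ODM is not cleared on subsequent reboots:
chdev -l sys0 -a clouddev=0
lsattr -El sys0 -a clouddev      # verify the attribute is back to the default of 0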
If you are looking for a more modern and automated solution for your AIX DR, I would highly recommend you take a look at the IBM VM Recovery Manager for IBM Power Systems. "Streamline site switches with a more economical, automated, easier to implement high availability and disaster recovery solution for IBM Power Systems."
http
---- Original Post from 2012 below ----
My team and I have recently been trying to streamline our AIX disaster recovery process. We've been looking for ways to reduce our overall recovery time. Several ideas were tossed around, such as a) using a standby DR LPAR with AIX already installed and using rsync/scp to keep the Prod and DR LPARs in sync, and b) using alt_disk_copy (with the -O flag for a device reset) to clone rootvg to an alternate disk, which is then replicated to DR. These methods may work but are cumbersome to administer and (in the case of alt_disk_copy) require additional (permanent) resources on every production system. With over 120 production instances of AIX, the disk space requirements start to add up.
So far we’ve concluded that the best way to achieve our goal is by using SAN replicated rootvg volumes at our DR site.
Our current DR process relies on recovery of AIX systems from mksysb images from a NIM master. All our data (non-rootvg) LUNs are already replicated to our DR site. The aim was to change the process and ‘recover’ our AIX images using replicated rootvg LUNs. This will reduce our overall recovery time at DR (which is crucial if we are to meet the proposed recovery time objectives set by our business). Based on current IBM documentation we were relatively comfortable with the proposed approach. The following IBM developerWorks article (originally published in 2009 and updated in late 2010) describes “scenarios in which remapping, copying, and reuse of SAN disks is allowed and supported. More easily switch AIX environments from one system to another and help achieve higher availability and reduced down time. These scenarios also allow for fast deployment of new systems using cloning.” The document focuses on fully virtualised environments that utilise shared processors and VIO servers. One area where this document is currently lacking in information is the use of NPIV and virtual fibre channel adapters in a DR scenario. We reached out to our contacts in the AIX development space and asked the following question:
“Hoping you can help us find some statements regarding support for a replicated rootvg environment using NPIV/Virtual Fibre Channel adapters?”

We received the following responses:
“Development has approved using NPIV to do this for one customer. Below are more detailed requirements for this DR strategy using NPIV. If a Disaster Recovery (DR) environment not using PowerHA Enterprise Edition is used, then we believe the white paper located http…
We have the following guidelines regarding the configuration, which includes NPIV as an option:
- All of the system configuration should be virtualized, with the possible exception of disk devices when using NPIV.
- If NPIV is used, then AIX MPIO must be used as the multi-pathing solution. If multi-pathing software other than AIX MPIO is used, then the vendor of that software must be contacted regarding a support statement.
- Install AIX (at least the minimum required TLs/SPs for the desired AIX version) and the software stack (middleware and applications) on the primary systems, compatible with the systems at both sites.
- Primary and secondary sites should be using systems with similar hardware, the same microcode levels, and the same VIOS levels.
- Many manual steps are needed to set up the virtual and physical devices accurately on the secondary site VIOS.
- If Virtual SCSI disks are being used, then discover the unique identification on the primary site and map the disks to the corresponding replication disks. Map the same appropriately on the VIOS at the secondary site.
- The level of VIOS should support the attribute to open the secondary devices passively. This setting needs to be set up correctly on the VIOS at the secondary site.
- The operating environment should not have subnet dependencies.
- Manage the replication relationships accurately. You may need to manually switch the secondary disks to the primary node.
- Raw disk usage may cause problems. Some middleware products may bypass the operating system and use the disk directly. They might have their own restrictions for this environment (e.g. anything that is device location code or storage LUN unique ID dependent may have issues when the cloned image is restarted on the secondary system with replicated storage).
Set the "ghostdev" attribute using chdev command (must be done on the primary). This attribute can be set using the command "chdev -l sys0 –a ghostdev=1". The ghostdev attribute will delete customized ODM database on rootvg when AIX detects it has booted from a different LPAR or system. If the "ghostdev" attribute is not set, then the booting from the alternate site will result in devices in ODM showing up in "Defined" or "Missing" state.
- After a failover to the secondary site, you may need to reset the boot device list for each LPAR, using the firmware SMS menus, before booting the LPAR.

Note that this is not an exhaustive list of issues. Refer to the white paper and study it as it applies to your environment. So as long as they are using MPIO, you're OK. If not using MPIO and some OEM storage is in use, then the storage vendor must also support it.”
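As a quick illustration of checking those prerequisites on the primary LPAR, here is a minimal sketch (hdisk0 is an assumed rootvg disk name):

# Confirm that AIX MPIO is managing the rootvg disk and that its paths are available.
lspath -l hdisk0
# Check the current ghostdev setting and enable it if required (must be done on the primary).
lsattr -El sys0 -a ghostdev
chdev -l sys0 -a ghostdev=1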
“Unsupported Methods
So, as a secondary objective, we have been working closely with our local IBM representatives to obtain some surety from IBM that our proposed DR strategy for AIX is fully supported by both the AIX development and support teams.
With that in mind, I'll provide an overview of our new DR approach and hope that it offers others insight into an alternative method for recovery, and also assists IBM in further understanding what some of the "larger" AIX customers are looking for in terms of simplified AIX disaster recovery.
What follows is a detailed description of our IBM AIX, PowerVM/Power Systems environment, the proposed recovery steps and other items for consideration.
Our Environment:
· - AIX 5.3*, 6.1 & 7.1 - 5300-12-04-1119, 6100-06-05-1115 and 7100-01-04-1216.
· - VIOS 2.2.1.3 – both production and DR.
· - VIOS physical FC adapters – both production and DR - 8Gb PCI Express Dual Port FC Adapter (5735) – Firmware level: 2.00X7.
· - All client LPARs utilise Virtual FC adapters (NPIV) for disk storage.
· - Production: POWER7 795 (9119-FHB). Firmware level: AH730_078.
· - DR: POWER6 595 (9119-FHA). Firmware level: EH350_120.
· - HDS VSP storage – at both production and DR sites.
· - AIX MPIO only – HDS ODM driver is installed (dev…).
· - Production: Dual VIOS.
· - DR: Single VIOS.
· - Production: LUNs are mapped directly to client LPARs via NPIV, virtual FC adapter pass-thru via dual VIOS. MPIO is in use, i.e. one path per VIOS/physical FC.
· - DR: LUNs are mapped directly to client LPARs via NPIV, virtual FC adapter pass-thru via a single VIOS, i.e. single path only.
· - Please refer to the following table for a summary of the environment details.
Our Recovery Procedure:
1. Change the sys0 ghostdev attribute value to 1 on the source production AIX system. Set the "ghostdev" attribute using the chdev command (this must be done on the primary): "chdev -l sys0 -a ghostdev=1". The ghostdev attribute will delete the customized ODM database on rootvg when AIX detects it has booted from a different LPAR or system.
2. Take note of the source system's rootvg hdisk PVID.
3. Select the source production rootvg LUN for replication on the HDS VSP.
4. Replicate the LUN from the production site to the DR HDS VSP.
5. In a DR test, suspend HDS replication from production to DR.
6. Assign the replicated LUN to the target LPAR on the DR POWER6 595, i.e. map the LUN to the WWPN of the virtual FC adapter on the DR LPAR.
7. Attempt to boot the DR LPAR using the replicated rootvg LUN. If necessary, enter the SMS menu to update the boot list, i.e. select the correct boot disk and check for the same PVID as the source host.
8. Once the LPAR has successfully booted, the AIX administrator configures the necessary devices, i.e. imports data volume groups, configures network interfaces, etc. This may also be scripted for execution during the first boot process.
9. Please refer to the following diagrams for a visual representation of the proposed process.
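A minimal sketch of the source-side preparation (steps 1 and 2 above), assuming hdisk0 is the rootvg disk:

# Step 1: enable ghostdev so the ODM is rebuilt when AIX detects a boot from a different LPAR or system.
chdev -l sys0 -a ghostdev=1
# Step 2: record the rootvg disk name and PVID, for later verification in the SMS menus at DR.
lspv | grep rootvg
bootinfo -b          # confirm which disk the LPAR currently boots from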
Some Notes/Caveats:
The following is a list of items that we understand are possible limitations and issues with our new DR process.
· - Booting from replicated rootvg disks may fail for several reasons, such as: a) there is unexpected corruption in the replicated LUN image due to rootvg not being quiesced during replication, or b) there is an unidentified issue with the AIX system that only becomes apparent the next time the system is booted; this could be misconfiguration by the administrator or some other unforeseen problem.
· - In the event that an LPAR fails to boot via a replicated rootvg LUN, a fallback method is available for recovery. Switching back to a manual NIM mksysb restore provides a sufficient backup should the replicated rootvg be unusable.
· - If the "ghostdev" attribute is not set, then booting from the DR site will result in devices in the ODM showing up in a "Defined" or "Missing" state.
· - Once a DR test is completed, the DR LPAR should be de-activated immediately so that SAN disk replication can be restarted between production and DR. Failure to perform this step may result in the DR LPAR failing as a result of file system corruption.
· - At present we are using AIX MPIO only. There is discussion of using HDLM in the future. We will contact HDS for a support statement regarding booting from replicated rootvg LUNs with HDLM installed.
· - The ghostdev attribute is not implemented in AIX 5.3. AIX 5.3 is no longer supported*.
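Relating to the "Defined"/"Missing" devices caveat above, here is a hedged sketch of the manual clean-up that would otherwise be needed. The device name is hypothetical, and this is not part of our procedure, since ghostdev avoids the problem entirely:

# List disk devices and their states after booting at DR without ghostdev set.
lsdev -Cc disk
# Remove a stale "Defined" disk definition from the ODM, then rediscover devices.
rmdev -dl hdisk2
cfgmgr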
So far all of our testing has been successful. We verified that we could replicate an SOE rootvg image of AIX 6.1 and 7.1 to DR and successfully boot an LPAR using the replicated disk. Based on these tests there doesn’t appear to be anything stopping us from using this method for DR purposes. The following table outlines the different versions of AIX we tested and the results.
Once the system was booted we needed to perform some post boot configuration tasks. These tasks were handled by two scripts that were called from /etc/inittab. On the source system we installed the new scripts (in /etc) and added new entries to the /etc/inittab file. These scripts only run if the systemid matches that of the DR systemid. Note: Only partial contents of each script are shown below…but you get the idea.
# mkitab -i srcmstr "AIX…"
# mkitab -i srcmstr "AIX…"
# lsitab -a | grep DR
AIXD…
AIXD…
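As a purely illustrative example of adding such an entry (the label "aixdr1" and the output redirection are assumptions, not the original entries), it could be added and verified like this:

# Add an entry after the srcmstr record that runs the DR configuration script once during boot.
mkitab -i srcmstr "aixdr1:2:once:/etc/AIX_DRconfig.ksh >/dev/console 2>&1"
lsitab aixdr1        # verify the entry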
AIX_DRconfig.ksh:
#!/bin/ksh
#set -xv
####
#
# This script will configure the AIX environment for DR purposes.
# This script will only run if the systemid matches the systemid of the DR system.
#
# aixdr : / # lsattr -EOl sys0 -a systemid | grep -v systemid | awk -F, '{print $2}'
#
# 0211D11C1 = DR POWER System
#
####
MYNAME=$(basename $0)
# Expected systemid of the DR POWER system (see the header comment above).
DR_SYSTEMID="0211D11C1"
# systemid of the system we are currently booting on.
LSATTR_SYSTEMID_DR=$(lsattr -EOl sys0 -a systemid | grep -v systemid | awk -F, '{print $2}')
if [ "$LS then #Set the DR hostname, IP address, netmask and gateway. #mktcpip -h <hostname> -a <IP to use in DR> -m <network mask> -i en0 -g <gateway IP> #en0
    dr_hostname=`cat /usr/local/dr/…`
    dr_defgw=`cat /usr/local/dr/…`
    dr_intf=`cat /usr/local/dr/…`
    dr_ip=`cat /usr/local/dr/…`
    dr_netm=`cat /usr/local/dr/…`
echo "$MYNAME: Setting DR hostname and IP address." mktcpip -h $dr_hostname -a $dr_ip -m $dr_netm -i $dr_intf -g $dr_defgw
    # Configure IP aliases
    # chdev -l en0 -a alias4=<IP address>,<netmask>
    # Remove IP aliases
    # chdev -l en0 -a delalias4=<IP address>,<netmask>
    for i in `cat /usr/local/dr/…`
    do
        chdev -l $dr_intf -a alias4=$i
    done
    for inet in `cat dr_inet.txt | grep -v en0`
    do
        for i in `cat /usr/local/dr/…`
        do
            chdev -l $inet -a alias4=$i
        done
    done
    # Configure the AIX environment for DR.
    echo "$MYNAME: Configure the AIX environment for DR."
    echo
    # Delete the MDC DNS entries
    echo "$MYNAME: Configuring DNS for DR."
    namerslv -d -i 10.1.6.21
    namerslv -d -i 10.1.4.21
    namerslv -d -i 10.1.5.21
    # Add the DR DNS entry
    namerslv -a -i 10.1.7.38
echo echo "$MYNAME: Configuring bootlist for DR." bootlist -m normal -o bootlist -m normal hdisk0 bootlist -m normal -o echo echo "$MYNAME: Configuring /etc/inittab for DR." lsitab -a | grep nim chitab "nim lsitab -a | grep nim echo
else
    MYNAME=$(basename $0)
    echo "$MYNAME: The systemid $LSATTR_SYSTEMID_DR does not match the expected DR systemid of $DR_SYSTEMID."
    echo "$MYNAME: This script should only be executed at DR."
    echo "$MYNAME: If you are not booting the system at the DR site, then you can ignore this message."
    echo "$MYNAME: No changes have been performed. Script is exiting."
fi
The ghostdev attribute essentially provides us with a clean ODM and allows the system to discover new devices and build the ODM from scratch. If you attempt to boot from a replicated rootvg disk without first setting the ghostdev attribute, your system may fail to boot (hang at LED 554) because of a new device tree and/or missing devices. You might be able to recover from this situation (without restoring from mksysb) by performing the steps outlined on pages 16-20 of the following document (thanks to Dominic Lancaster at IBM for the presentation).
Great post Chris.
I'm trying to make it useful for configuring IBM GDR.
GDR itself works very well with recreating the LPAR, creating the VIOS mappings, and reversing the replication direction.
But there is an important problem: after a DR operation I must reconfigure my machine. The most important part is the IP configuration.
I can see that using ghostdev may be good - I suppose it's better to have working hdisk0-hdisk9 instead of defined hdisk0-hdisk9 plus available hdisk10-hdisk19, but I'll think about it more.
The final case is post-DR config. You use scripts in /etc/inittab. You provided one here - could you please provide the second one also?
You refer to files like /usr/local/dr/dr_en0.txt - what are these files? Just "interface IP mask" or something like this?
Thanks in advance.
Thanks Chris for the wonderful article.
But we have an issue recovering our PowerHA 7.1.3 environment in DR.
We restored the mksysb and replicated the other LUNs, but CAA was not coming up.
Could you please help with a better procedure to recover PowerHA in DR with replicated CAA and data VG disks?
Hi, there is a way to recover CAA. Send me an email (cgibson@au1.ibm.com) and I will share the details with you. Cheers, Chris.
Procedure for using bitwise copy for creating new servers:
It used to be that we would do the following to create, say, server2 from server1:
nim mksysb server1
nim define machine object server2
nim standalone server installation to server2 using mksysb_server1
Add the default route to server2
We are now using the following process for the cloning:
on server1:
chdev -l sys0 -a ghostdev=2
create a cloned copy of the rootvg disks using some SAN function.
assign the new clone disks to a new LPAR
boots up the new LPAR
chdev -l inet0 -a hostname=server2 -a route=default_route_settings
chdev -l en0 -a netaddr=server2_ip_address -a netmask=server2_netmask
/usr/sbin/rsct/install/bin/recfgct
/usr/sbin/rsct/bin/rmcctrl -z
/usr/sbin/rsct/bin/rmcctrl -A
/usr/sbin/rsct/bin/rmcctrl -p
With the above process, I confirmed that for server2 its PVID is changed and DLPAR functions work on it.
I am still a little bit concerned that the above process may not be equivalent to a mksysb backup and restore. Can you confirm it? Did I miss anything?
Thank you for your article that helped me a lot on understanding the replication of rootvg.
To improve our RTO, I'm looking for a solution that would also recreate the LPAR definitions on the HMC and the VIOS configuration at DR.
A solution such as SRR (Simplified Remote Restart) may be appropriate, but it seems to me that the production FSP must still be reachable, which limits the value of SRR in the case of a crash.
Any ideas?
Yes, you are correct. The FSP must be available for RR to work. You could write a script that syncs/creates the LPAR profiles at the DR site.
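As a rough, hypothetical sketch of that idea (the managed system names, LPAR name and attribute lists below are assumptions, not taken from this post), such a script could be built around the HMC CLI:

# On the production HMC: capture the partition profile attributes.
lssyscfg -r prof -m PROD-795 -F name,lpar_name,min_mem,desired_mem,max_mem,min_procs,desired_procs,max_procs > profiles.txt
# On the DR HMC: create a matching LPAR and profile (attribute list abridged; additional attributes will be required).
mksyscfg -r lpar -m DR-595 -i "name=lpar01,profile_name=normal,lpar_env=aixlinux,min_mem=1024,desired_mem=4096,max_mem=8192,proc_mode=ded,min_procs=1,desired_procs=2,max_procs=4"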
Did anyone try this in a CAA environment? I mean in a PowerHA 7.1 SystemMirror cluster. I believe the CAA cluster will not work in both cases:
1. If ghostdev is set, the ODM is wiped, which in turn will remove the CAA cluster too.
2. If ghostdev is not set, the CAA replicated disk will have a different logical device name and UDID, which will disturb the CAA cluster.
However, the nodeid has to be regenerated, which can be done without much hassle.
Do we have a way to get PowerHA replicated to DR as-is?
We generate new WWPNs at the DR site. Thanks for sharing James. Very interesting setup and I'm glad it's all working OK for you.
So are the same WWPNs preserved for the Primary/DR partition? My biggest issue was creating the NPIV devices on the DR hardware from a replicated rootvg, which generated new device WWPNs that had to be zoned separately to all of the storage devices. I'm in the process of deploying our first AIX servers for our organisation for TSM v6.3, and essentially we're doing the same thing on AIX 7.1: redundant VIOS at each site with redundant SAN per VIOS, and all data LUNs, including rootvg, are SAN attached using SVC vdisks, with synchronous Metro Mirroring to DR. Our test so far has included an AIX 7.1 client LPAR running TSM v6.3.3.0 with replicated SAN attached VSCSI for rootvg, NPIV VFC attached data LUNs, NPIV VFC attached 3592 libraries (22 drives in total), plus an NPIV VFC attached ProtecTIER (16 virtual LTO) TS7650G. Using SDDPCM and IBM Atape control path / data path failover, I can shut down the TSM server, break the replication and power on the DR partition without losing any data in the DB2 database, or on a file storage pool that had not yet been migrated to tape, and it all comes online at the DR site in 5 minutes without the need to restore a TSM DB or an AIX operating system from backups. I also tested making changes to the client at the DR site and replicating back, and it all works OK. Although I didn't know about ghostdev, the rootvg is presented via the two VIOS as VSCSI (not NPIV VFC) and had no issues booting. I did notice that the hdisk count increased, but all the VGs sorted themselves out fine. Stale devices came back online as the original device IDs once I failed back to the prod site.
Great post Chris. I hadn't heard about the ghostdev attribute before. It will definitely come in handy.
Excellent article, Chris. Some good pointers and a new attribute to use.