Replicated rootvg, ghostdev and NPIV

Update, May 14th, 2019: It's been 7 years since I first published this post. Power Systems and PowerVM technology has progressed significantly since then, hence the need to update this blog post. So, if you're planning on using, or are already using, Simplified Remote Restart (SRR) and/or offline LPM, please do NOT enable ghostdev on your systems. If ghostdev is set to 1 and you attempt to use SRR (or offline LPM), the AIX LPAR will reset the ODM during boot. This is (most likely) not the desired behaviour. If the ODM is cleared, the system will need to be reconfigured so that TCP/IP and LVM are operational again. If you require "ghostdev like" behaviour for your AIX disaster recovery (DR) process, I would recommend you set the sys0 attribute, clouddev, to 1, immediately after you have booted from your replicated rootvg. Rebooting your AIX system with this setting enabled will "Recreate ODM devices on next boot" and allow you to reconfigure your LPAR for DR. Once you've booted with clouddev=1 and reconfigured your AIX LPAR at DR, immediately disable clouddev (i.e. set it to 0, the default), so that the ODM is not cleared again on the next system reboot. Some more details on clouddev below.
"Details on the "ghostdev" version-2 function (aka clouddev) we need from AIX: The desired behavior we want to see occur in AIX to manage resetting device-id naming in the OS is:
1) A user (or a script run when installing 'cloud-init') runs: /usr/sbin/chdev -l sys0 -a clouddev=1
2) AIX then defines an "ibm,…" NVRAM variable.
3) During AIX startup of the ODM, if this clouddev flag is set, AIX looks for the "ibm,…" NVRAM variable; if it does not exist, it sets it to true [and the ODM device entries are rebuilt].
4) If the "ibm,…" NVRAM variable already exists, [the ODM is left untouched on that boot].
This solution leverages the fact that NVRAM data is preserved across Live Partition Migration but not across the image/instance IaaS lifecycle. It also means we are no longer dependent upon detecting different LPAR ids, i.e. we don't break if the parent image is deployed to the same lpar-id on the same host. It also means we no longer have to toggle the ghostdev setting at boot or at AE (Activation Engine) reset (the goal being no need to do any sort of reset, and to be able to do a capture at any time)."
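For illustration, here is a minimal sketch of the clouddev approach described in the update above. The commands are standard AIX commands; the reconfiguration steps in between are only indicative of what your DR runbook would do.

# At the DR site, immediately after the LPAR has booted from the replicated rootvg:
chdev -l sys0 -a clouddev=1      # "Recreate ODM devices on next boot"
shutdown -Fr                     # reboot so the ODM devices are recreated

# After the reboot, reconfigure the LPAR for DR (TCP/IP, import volume groups, etc.),
# then immediately disable clouddev so the ODM is not cleared on subsequent reboots:
chdev -l sys0 -a clouddev=0
lsattr -El sys0 -a clouddev      # verify the attribute is back to the default of 0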
If you are looking for a more modern and automated solution for your AIX DR, I would highly recommend you take a look at the IBM VM Recovery Manager for IBM Power Systems. "Streamline site switches with a more economical, automated, easier to implement high availability and disaster recovery solution for IBM Power Systems."
http
---- Original Post from 2012 below ----
My team and I have recently been trying to streamline our AIX disaster recovery process. We've been looking for ways to reduce our overall recovery time. Several ideas were tossed around, such as a) using a standby DR LPAR with AIX already installed and using rsync/scp to keep the Prod and DR LPARs in sync, and b) using alt_disk_copy (with the -O flag for a device reset) to clone rootvg to an alternate disk, which is then replicated to DR. These methods may work but are cumbersome to administer and (in the case of alt_disk_copy) require additional (permanent) resources on every production system. With over 120 production instances of AIX, the disk space requirements start to add up.
So far we’ve concluded that the best way to achieve our goal is by using SAN replicated rootvg volumes at our DR site.
Our current DR process relies on recovery of AIX systems from mksysb images from a NIM master. All our data (non-rootvg) LUNs are already replicated to our DR site. The aim was to change the process and ‘recover’ our AIX images using replicated rootvg LUNs. This will reduce our overall recovery time at DR (which is crucial if we are to meet the proposed recovery time objectives set by our business). Based on current IBM documentation we were relatively comfortable with the proposed approach. The following IBM developerWorks article (originally published in 2009 and updated in late 2010) describes “scenarios in which remapping, copying, and reuse of SAN disks is allowed and supported. More easily switch AIX environments from one system to another and help achieve higher availability and reduced down time. These scenarios also allow for fast deployment of new systems using cloning.” The document focuses on fully virtualised environments that utilise shared processors and VIO servers. One area where this document is currently lacking in information is the use of NPIV and virtual fibre channel adapters in a DR scenario. We reached out to our contacts in the AIX development space and asked the following question:
“Hoping you can help us find some statements regarding support for a replicated rootvg environment using NPIV/Virtual Fibre Channel adapters?”

We received the following responses:
“Development has approved using NPIV to do this for one customer. Below are more detailed requirements for this DR strategy using NPIV. If a Disaster Recovery (DR) environment not using PowerHA Enterprise Edition is used, then we believe the white paper located http…
We have the following guidelines regarding the configuration, which includes NPIV as an option:
- All of the system configuration should be virtualized, with the possible exception of disk devices when using NPIV.
- If NPIV is used, then AIX MPIO must be used as the multi-pathing solution. If multi-pathing software other than AIX MPIO is used, then the vendor of that software must be contacted regarding a support statement.
- Install AIX (at least the minimum required TLs/SPs for the desired AIX version) and the software stack (middleware and applications) on the primary systems, compatible with the systems at both sites.
- Primary and secondary sites should be using systems with similar hardware, the same microcode levels, and the same VIOS levels.
- Many manual steps are needed to set up the virtual and physical devices accurately on the secondary site VIOS.
- If Virtual SCSI disks are being used, then discover the unique identification on the primary site and map the disks to the corresponding replication disks. Map the same appropriately on the VIOS at the secondary site.
- The level of VIOS should support the attribute to open the secondary devices passively. This setting needs to be set up correctly on the VIOS at the secondary site.
- The operating environment should not have subnet dependencies.
- Manage the replication relationships accurately. You may need to manually switch the secondary disks to the primary node.
- Raw disk usage may cause problems. Some middleware products may bypass the operating system and use the disk directly. They might have their own restrictions for this environment (e.g. anything that is device location code or storage LUN unique ID dependent may have issues when the cloned image is restarted on the secondary system with replicated storage).
Set the "ghostdev" attribute using chdev command (must be done on the primary). This attribute can be set using the command "chdev -l sys0 –a ghostdev=1". The ghostdev attribute will delete customized ODM database on rootvg when AIX detects it has booted from a different LPAR or system. If the "ghostdev" attribute is not set, then the booting from the alternate site will result in devices in ODM showing up in "Defined" or "Missing" state.
- After a failover to the secondary site, you may need to reset the boot device list for each LPAR, using the firmware SMS menus, before booting the LPAR.

Note that this is not an exhaustive list of issues. Refer to the white paper and study it as it applies to your environment. So as long as they are using MPIO, you're OK. If not using MPIO and some OEM storage is in use, then the storage vendor must also support it.”
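As a quick illustration of checking those prerequisites on the primary LPAR, here is a minimal sketch (hdisk0 is an assumed rootvg disk name):

# Confirm that AIX MPIO is managing the rootvg disk and that its paths are available.
lspath -l hdisk0
# Check the current ghostdev setting and enable it if required (must be done on the primary).
lsattr -El sys0 -a ghostdev
chdev -l sys0 -a ghostdev=1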
“Unsupported Methods
So, as a secondary objective, we have been working closely with our local IBM representatives to obtain some surety from IBM that our proposed DR strategy for AIX is fully supported by both the AIX development and support teams.
With that in mind, I'll provide an overview of our new DR approach and hope that it offers others insight into an alternative method for recovery, and also assists IBM in further understanding what some of the "larger" AIX customers are looking for in terms of simplified AIX disaster recovery.
What follows is a detailed description of our IBM AIX, PowerVM/Power Systems environment, the proposed recovery steps and other items for consideration.
Our Environment:
· - AIX 5.3*, 6.1 & 7.1 - 5300-12-04-1119, 6100-06-05-1115 and 7100-01-04-1216.
· - VIOS 2.2.1.3 – both production and DR.
· - VIOS physical FC adapters – both production and DR - 8Gb PCI Express Dual Port FC Adapter (5735) – Firmware level: 2.00X7.
· - All client LPARs utilise Virtual FC adapters (NPIV) for disk storage.
· - Production: POWER7 795 (9119-FHB). Firmware level: AH730_078.
· - DR: POWER6 595 (9119-FHA). Firmware level: EH350_120.
· - HDS VSP storage – at both production and DR sites.
· - AIX MPIO only – HDS ODM driver is installed (dev…).
· - Production: Dual VIOS.
· - DR: Single VIOS.
· - Production: LUNs are mapped directly to client LPARs via NPIV, virtual FC adapter pass-thru via dual VIOS. MPIO is in use, i.e. one path per VIOS/physical FC.
· - DR: LUNs are mapped directly to client LPARs via NPIV, virtual FC adapter pass-thru via a single VIOS, i.e. single path only.
· - Please refer to the following table for a summary of the environment details.
Our Recovery Procedure:
1. Change the sys0 ghostdev attribute value to 1 on the source production AIX system. Set the "ghostdev" attribute using the chdev command (this must be done on the primary): "chdev -l sys0 -a ghostdev=1". The ghostdev attribute will delete the customized ODM database on rootvg when AIX detects it has booted from a different LPAR or system.
2. Take note of the source system's rootvg hdisk PVID.
3. Select the source production rootvg LUN for replication on the HDS VSP.
4. Replicate the LUN from the production site to the DR HDS VSP.
5. In a DR test, suspend HDS replication from production to DR.
6. Assign the replicated LUN to the target LPAR on the DR POWER6 595, i.e. map the LUN to the WWPN of the virtual FC adapter on the DR LPAR.
7. Attempt to boot the DR LPAR using the replicated rootvg LUN. If necessary, enter the SMS menu to update the boot list, i.e. select the correct boot disk and check for the same PVID as the source host.
8. Once the LPAR has successfully booted, the AIX administrator configures the necessary devices, i.e. imports data volume groups, configures network interfaces, etc. This may also be scripted for execution during the first boot process.
9. Please refer to the following diagrams for a visual representation of the proposed process.
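A minimal sketch of the source-side preparation (steps 1 and 2 above), assuming hdisk0 is the rootvg disk:

# Step 1: enable ghostdev so the ODM is rebuilt when AIX detects a boot from a different LPAR or system.
chdev -l sys0 -a ghostdev=1
# Step 2: record the rootvg disk name and PVID, for later verification in the SMS menus at DR.
lspv | grep rootvg
bootinfo -b          # confirm which disk the LPAR currently boots from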
Some Notes/Caveats:
The following is a list of items that we understand are possible limitations and issues with our new DR process.
· - Booting from replicated rootvg disks may fail for several reasons, such as: a) there is unexpected corruption in the replicated LUN image due to rootvg not being quiesced during replication, or b) there is an unidentified issue with the AIX system that only becomes apparent the next time the system is booted; this could be misconfiguration by the administrator or some other unforeseen problem.
· - In the event that an LPAR fails to boot via a replicated rootvg LUN, a fallback method is available for recovery. Switching back to a manual NIM mksysb restore provides a sufficient backup should the replicated rootvg be unusable.
· - If the "ghostdev" attribute is not set, then booting from the DR site will result in devices in the ODM showing up in a "Defined" or "Missing" state.
· - Once a DR test is completed, the DR LPAR should be de-activated immediately so that SAN disk replication can be restarted between production and DR. Failure to perform this step may result in the DR LPAR failing as a result of file system corruption.
· - At present we are using AIX MPIO only. There is discussion of using HDLM in the future. We will contact HDS for a support statement regarding booting from replicated rootvg LUNs with HDLM installed.
· - The ghostdev attribute is not implemented in AIX 5.3. AIX 5.3 is no longer supported*.
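Relating to the "Defined"/"Missing" devices caveat above, here is a hedged sketch of the manual clean-up that would otherwise be needed. The device name is hypothetical, and this is not part of our procedure, since ghostdev avoids the problem entirely:

# List disk devices and their states after booting at DR without ghostdev set.
lsdev -Cc disk
# Remove a stale "Defined" disk definition from the ODM, then rediscover devices.
rmdev -dl hdisk2
cfgmgr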
So far all of our testing has been successful. We verified that we could replicate an SOE rootvg image of AIX 6.1 and 7.1 to DR and successfully boot an LPAR using the replicated disk. Based on these tests there doesn’t appear to be anything stopping us from using this method for DR purposes. The following table outlines the different versions of AIX we tested and the results.
Once the system was booted we needed to perform some post boot configuration tasks. These tasks were handled by two scripts that were called from /etc/inittab. On the source system we installed the new scripts (in /etc) and added new entries to the /etc/inittab file. These scripts only run if the systemid matches that of the DR systemid. Note: Only partial contents of each script are shown below…but you get the idea.
# mkitab -i srcmstr "AIX…"
# mkitab -i srcmstr "AIX…"
# lsitab -a | grep DR
AIXD…
AIXD…
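As a purely illustrative example of adding such an entry (the label "aixdr1" and the output redirection are assumptions, not the original entries), it could be added and verified like this:

# Add an entry after the srcmstr record that runs the DR configuration script once during boot.
mkitab -i srcmstr "aixdr1:2:once:/etc/AIX_DRconfig.ksh >/dev/console 2>&1"
lsitab aixdr1        # verify the entry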
AIX_DRconfig.ksh:
#!/bin/ksh
#set -xv
####
#
# This script will configure the AIX environment for DR purposes.
# This script will only run if the systemid matches the systemid of the DR system.
#
# aixdr : / # lsattr -EOl sys0 -a systemid | grep -v systemid | awk -F, '{print $2}'
#
# 0211D11C1 = DR POWER System
#
####
MYNAME=$(basename $0)
# Expected systemid of the DR POWER system (see the header comment above).
DR_SYSTEMID="0211D11C1"
# systemid of the system we are currently booting on.
LSATTR_SYSTEMID_DR=$(lsattr -EOl sys0 -a systemid | grep -v systemid | awk -F, '{print $2}')
if [ "$LS then #Set the DR hostname, IP address, netmask and gateway. #mktcpip -h <hostname> -a <IP to use in DR> -m <network mask> -i en0 -g <gateway IP> #en0
    dr_hostname=`cat /usr/local/dr/…`
    dr_defgw=`cat /usr/local/dr/…`
    dr_intf=`cat /usr/local/dr/…`
    dr_ip=`cat /usr/local/dr/…`
    dr_netm=`cat /usr/local/dr/…`
echo "$MYNAME: Setting DR hostname and IP address." mktcpip -h $dr_hostname -a $dr_ip -m $dr_netm -i $dr_intf -g $dr_defgw
    # Configure IP aliases
    # chdev -l en0 -a alias4=<IP address>,<netmask>
    # Remove IP aliases
    # chdev -l en0 -a delalias4=<IP address>,<netmask>
    for i in `cat /usr/local/dr/…`
    do
        chdev -l $dr_intf -a alias4=$i
    done
    for inet in `cat dr_inet.txt | grep -v en0`
    do
        for i in `cat /usr/local/dr/…`
        do
            chdev -l $inet -a alias4=$i
        done
    done
    # Configure the AIX environment for DR.
    echo "$MYNAME: Configure the AIX environment for DR."
    echo
    # Delete the MDC DNS entries
    echo "$MYNAME: Configuring DNS for DR."
    namerslv -d -i 10.1.6.21
    namerslv -d -i 10.1.4.21
    namerslv -d -i 10.1.5.21
    # Add the DR DNS entry
    namerslv -a -i 10.1.7.38
echo echo "$MYNAME: Configuring bootlist for DR." bootlist -m normal -o bootlist -m normal hdisk0 bootlist -m normal -o echo echo "$MYNAME: Configuring /etc/inittab for DR." lsitab -a | grep nim chitab "nim lsitab -a | grep nim echo
else
    MYNAME=$(basename $0)
    echo "$MYNAME: The systemid $LSATTR_SYSTEMID_DR does not match the expected DR systemid of $DR_SYSTEMID."
    echo "$MYNAME: This script should only be executed at DR."
    echo "$MYNAME: If you are not booting the system at the DR site, then you can ignore this message."
    echo "$MYNAME: No changes have been performed. Script is exiting."
fi
The ghostdev attribute essentially provides us with a clean ODM and allows the system to discover new devices and build the ODM from scratch. If you attempt to boot from a replicated rootvg disk without first setting the ghostdev attribute, your system may fail to boot (hang at LED 554) because of a new device tree and/or missing devices. You might be able to recover from this situation (without restoring from mksysb) by performing the steps outlined on pages 16-20 of the following document (thanks to Dominic Lancaster at IBM for the presentation).
Great post Chris.
I'm trying to make it useful for configuring IBM GDR.
GDR itself works very well with recreating the LPAR, creating the VIOS mappings, and reversing the replication direction.
But there is an important problem: after a DR operation I must reconfigure my machine. The most important part is the IP configuration.
I can see that using ghostdev may be good - I suppose it's better to have working hdisk0-hdisk9 instead of defined hdisk0-hdisk9 plus available hdisk10-hdisk19, but I'll think about it more.
The final case is post-DR config. You use scripts in /etc/inittab. You provided one here - could you please provide the second one also?
You refer to files like /usr/local/dr/dr_en0.txt - what are these files? Just "interface IP mask" or something like this?
Thanks in advance.
Thanks Chris for the wonderful article.
But we have an issue recovering our PowerHA 7.1.3 environment in DR.
We restored the mksysb and replicated the other LUNs, but CAA was not coming up.
Could you please help with a better procedure to recover PowerHA in DR with replicated CAA and data VG disks?
Hi, there is a way to recover CAA. Send me an email (cgibson@au1.ibm.com) and I will share the details with you. Cheers, Chris.
Procedure for using bitwise copy for creating new servers:
It used to be that we would do the following to create, say, server2 from server1:
nim mksysb server1
nim define machine object server2
nim standalone server installation to server2 using mksysb_server1
Add the default route to server2
We are now using the following process for the cloning:
on server1:
chdev -l sys0 -a ghostdev=2
create a cloned copy of the rootvg disks using some SAN function.
assign the new clone disks to a new LPAR
boots up the new LPAR
chdev -l inet0 -a hostname=server2 -a route=default_route_settings
chdev -l en0 -a netaddr=server2_ip_address -a netmask=server2_netmask
/usr/sbin/rsct/install/bin/recfgct
/usr/sbin/rsct/bin/rmcctrl -z
/usr/sbin/rsct/bin/rmcctrl -A
/usr/sbin/rsct/bin/rmcctrl -p
With the above process, I confirmed that for server2 its PVID is changed and DLPAR functions work on it.
I am still a little bit concerned that the above process may not be equivalent to a mksysb backup and restore. Can you confirm it? Did I miss anything?
Thank you for your article that helped me a lot on understanding the replication of rootvg.
To improve our RTO, I'm looking for a solution that would also recreate the LPAR definitions on the HMC and the VIOS configuration at DR.
A solution such as SRR (Simplified Remote Restart) may be appropriate, but it seems to me that the production FSP must still be reachable, which limits the value of SRR in the case of a crash.
Any ideas?
Yes, you are correct. The FSP must be available for RR to work. You could write a script that syncs/creates the LPAR profiles at the DR site.
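As a rough, hypothetical sketch of that idea (the managed system names, LPAR name and attribute lists below are assumptions, not taken from this post), such a script could be built around the HMC CLI:

# On the production HMC: capture the partition profile attributes.
lssyscfg -r prof -m PROD-795 -F name,lpar_name,min_mem,desired_mem,max_mem,min_procs,desired_procs,max_procs > profiles.txt
# On the DR HMC: create a matching LPAR and profile (attribute list abridged; additional attributes will be required).
mksyscfg -r lpar -m DR-595 -i "name=lpar01,profile_name=normal,lpar_env=aixlinux,min_mem=1024,desired_mem=4096,max_mem=8192,proc_mode=ded,min_procs=1,desired_procs=2,max_procs=4"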
Did anyone try this in a CAA environment? I mean in a PowerHA 7.1 SystemMirror cluster. I believe the CAA cluster will not work in both cases:
1. If ghostdev is set, the ODM is wiped, which in turn will remove the CAA cluster too.
2. If ghostdev is not set, the CAA replicated disk will have a different logical device name and UDID, which will disturb the CAA cluster.
However, the nodeid has to be regenerated, which can be done without much hassle.
Do we have a way to get PowerHA replicated to DR as-is?
We generate new WWPNs at the DR site. Thanks for sharing James. Very interesting setup and I'm glad it's all working OK for you.
So are the same WWPNs preserved for the Primary/DR partition? My biggest issue was creating the NPIV devices on the DR hardware from a replicated rootvg, which generated new device WWPNs that had to be zoned separately to all of the storage devices. I'm in the process of deploying our first AIX servers for our organisation for TSM v6.3, and essentially we're doing the same thing on AIX 7.1: redundant VIOS at each site with redundant SAN per VIOS, and all data LUNs, including rootvg, are SAN attached using SVC vdisks, with synchronous Metro Mirroring to DR. Our test so far has included an AIX 7.1 client LPAR running TSM v6.3.3.0 with replicated SAN attached VSCSI for rootvg, NPIV VFC attached data LUNs, NPIV VFC attached 3592 libraries (22 drives in total), plus an NPIV VFC attached ProtecTIER (16 virtual LTO) TS7650G. Using SDDPCM and IBM Atape control path / data path failover, I can shut down the TSM server, break the replication and power on the DR partition without losing any data in the DB2 database, or on a file storage pool that had not yet been migrated to tape, and it all comes online at the DR site in 5 minutes without the need to restore a TSM DB or an AIX operating system from backups. I also tested making changes to the client at the DR site and replicating back, and it all works OK. Although I didn't know about ghostdev, the rootvg is presented via the two VIOS as VSCSI (not NPIV VFC) and had no issues booting. I did notice that the hdisk count increased, but all the VGs sorted themselves out fine. Stale devices came back online as the original device IDs once I failed back to the prod site.
Great post Chris. I hadn't heard about the ghostdev attribute before. It will definitely come in handy.
Excellent article, Chris. Some good pointers and a new attribute to use.