Simplifying AIX Live Update with PowerVC
Starting with AIX Version 7.2, the AIX operating system provides the Live Update function which eliminates downtime required when patching the AIX operating system kernel. Your application workloads are not stopped, yet they can take advantage of the new fixes immediately after the Live Update operation, without a reboot. Administrators can use the Live Update feature to install AIX interim fixes, service packs and technology levels.
In a nutshell, Live Update allows AIX administrators to install AIX updates in-place, but instead of rebooting, we clone the root volume group (rootvg), boot a surrogate LPAR with the updated kernel, live migrate the processes to the new surrogate LPAR and then remove the original LPAR. If you’re not familiar with the Live Update (LU) concepts or would like to learn more about how it works, please read my blog on this topic. I encourage you to read this (and other related material on Live Update) before you attempt to implement LU.
How to live update your AIX system without rebooting the server
If you’ve used LU before, you’ll know that there is a requirement for several spare disks (at a minimum two disks) to be available, on the original partition. This usually requires the AIX admin to perform several administrative storage steps prior to the operation. With AIX 7.2 TL2 we now can simplify this process by utilising IBM PowerVC Virtualization Center (version 1.3.3.1).
PowerVC is an advanced virtualization and cloud management offering, built on OpenStack, that provides simplified virtualization management and cloud deployments for IBM AIX (and IBM i and Linux virtual machines) running on IBM Power Systems.
Leveraging this enhancement means that Live Update will use PowerVC, instead of the Hardware Management Console (HMC), to create the surrogate LPAR. It also means that the storage required by Live Update will be provisioned automatically (without administrator intervention) by PowerVC. In addition, when using PowerVC, it is no longer necessary to pre-configure the lvupdate.data configuration file (this is still required in non-PowerVC managed environments).
This new capability works with both HMC and NovaLink managed PowerVC Cloud environments, as shown in the following diagram.
For a complete list of the supported and required versions to support AIX Live Update in a PowerVC managed environment, please refer to the following link:
http
I’ll now demonstrate how to use this new capability. Before I start, I confirm that my AIX and PowerVC systems are at the correct versions to support this operation.
PowerVC Management Server
[root@cgpvc ~]# cat /opt
[1.3.3.0 Install]
name = IBM PowerVC
version = 1.3.3.0
build = 20170612-1428
install-date = 2017-09-05
offering = standard
cloud_enabled = yes

[1.3.3.1 Install]
name = IBM PowerVC
version = 1.3.3.1
build = 20170901-1542
install-date = 2017-09-13
AIX LPAR
root@orion / # oslevel -s
7200-02-01-1732
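The version check on the AIX side can be scripted. The following is a minimal sketch only: the function name oslevel_at_least is my own invention, and the 7200-02 minimum is taken from the "AIX 7.2 TL2" statement earlier in this post rather than from the official support matrix, so please confirm the required levels against the IBM documentation.

```shell
# Hypothetical helper: return 0 if an `oslevel -s` string
# (format VRMF-TL-SP-YYWW, e.g. 7200-02-01-1732) is at or above
# the given minimum release and technology level.
# Usage: oslevel_at_least <oslevel-string> <min-release> <min-tl>
oslevel_at_least() {
    echo "$1" | awk -F- -v rel="$2" -v tl="$3" '
        {
            # Newer release always passes; same release needs TL >= minimum.
            ok = ($1 + 0 > rel + 0) || ($1 + 0 == rel + 0 && $2 + 0 >= tl + 0)
            exit !ok
        }'
}
```

On the LPAR you might call it as: oslevel_at_least "$(oslevel -s)" 7200 2 && echo "AIX level OK for Live Update with PowerVC".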
Please note that the AIX LPAR (named orion) is already managed by PowerVC. This is a requirement for the Live Update operation to succeed. I can view the LPAR details from the PowerVC UI.
The LPAR is configured with a single disk only (for rootvg), as shown in the lspv output from inside the LPAR.
root@orion / # lspv
hdisk0 00f9
I can also view the disk from the “Attached Volumes” tab under the VM details in PowerVC.
The lvupdate.data file has not been configured and does not need to exist prior to starting the LU operation.
root@orion / # ls -tlr /var
ls: 0653-341 The file /var
To use this new capability, you first employ the new pvcauth tool from your AIX system to authenticate with the PowerVC management server. This command is used to obtain a token, required to use the PowerVC services for Live Update. In the example below, we have authenticated with a PowerVC server named cgpvc, using the PowerVC user called pvcadmin.
root@orion / # pvcauth -u pvcadmin -p abc1234 -a cgpvc
root@orion / # pvcauth -l
Address : 10.1.50.232
User name: pvcadmin
Project : ibm-default
Port : 5000
TTL : 5:59:58
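Because the token has a limited lifetime (the TTL field in the pvcauth -l listing), it can be useful to know how many seconds remain before starting a long operation. Here is a rough sketch, assuming the listing always contains a line of the form "TTL : H:MM:SS" as in my example; the function name pvc_ttl_seconds is hypothetical.

```shell
# Hypothetical helper: read `pvcauth -l` output on stdin and print the
# remaining token lifetime in seconds. Assumes a "TTL : H:MM:SS" line.
# Returns non-zero if no such line is found.
pvc_ttl_seconds() {
    awk '$1 ~ /^TTL/ {
            # The last field holds the H:MM:SS value.
            n = split($NF, t, ":")
            if (n == 3) {
                print t[1] * 3600 + t[2] * 60 + t[3]
                found = 1
            }
         }
         END { exit !found }'
}
```

For example, remaining=$(pvcauth -l | pvc_ttl_seconds) stores the value in a variable, which you could compare against a threshold of your choosing and re-run pvcauth if the token is close to expiring.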
Note: Before authenticating with PowerVC, the "Administrator" role should be assigned to the PowerVC user, otherwise the following message will be reported by Live Update: "1430-175 FAILED: User does not have PowerVC permissions (admin role) for Live Update processing". You can review and change assigned roles from the PowerVC User Interface, as shown in the figure below.
With the token in place, next we perform a Live Update preview operation with geninstall -k -p. Note that the output below has been shortened for brevity. You’ll notice that, along with the usual checks, there are several new tests performed to ensure the PowerVC environment is ready for Live Update.
root@orion / # geninstall -k -p
Validating live update input data.

Computing the estimated time for the live update operation:
----
LPAR: orion
Blackout time(in seconds): 12
Total operation time(in seconds): 1559
...
Checking lpar minimal memory size:
----
Required memory size: 2048 MB
...
Checking other requirements:
----
...
PASSED: PowerVC token is valid.
PASSED: PowerVC is at a supported level.
PASSED: User has PowerVC permissions for Live Update processing.
PASSED: Host is not in maintenance mode.
PASSED: PowerVC token expiration date is valid.
PASSED: PowerVC network devices match those present on partition.
PASSED: PowerVC volumes match hdisks present on partition.
...
INFO: Any system dumps present in the current dump logical
Note, if the LPAR is not currently managed by PowerVC, you would see the following failure during the geninstall preview.
1430-190 FAILED: Live Update initialization of PowerVC failed.
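If you run the preview from a script, you may want to stop as soon as any check reports FAILED. The sketch below simply scans the preview text for "FAILED:" lines, as seen in the messages above; it does not rely on the exit status of geninstall, which I have not verified. The function name lu_preview_ok is hypothetical.

```shell
# Hypothetical helper: read `geninstall -k -p` output on stdin, print
# every line containing "FAILED:", and return non-zero if any were found.
lu_preview_ok() {
    awk '/FAILED:/ { print; bad = 1 }
         END { exit bad + 0 }'
}
```

For example: geninstall -k -p 2>&1 | lu_preview_ok || echo "Live Update preview reported failures". Only the failing checks are printed, which makes them easier to spot than in the full preview output.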
The preview was successful. We can initiate the actual LU process now, using geninstall -k. Note, the output is identical to that of a “non-PowerVC enabled” Live Update operation. Again, the output has been shortened for brevity. We are not applying any AIX updates here; instead, we are simply performing the Live Update operation, which will provision a new LPAR and migrate the workload, without updating AIX. This is a great way to test Live Update.
root@orion / # geninstall -k
…
Non-interruptable live update operation begins in 10 seconds.
Broadcast message from root@orion (pts/0) at 02:22:09 ...
Live AIX update in progress.
Initializing live update on original LPAR.
Validating original LPAR environment.
Beginning live update operation on original LPAR.
Requesting resources required for live update.
....
Notifying applications of impending live update.
....
Creating rootvg for boot of surrogate.
....
Starting the surrogate LPAR.
....
Creating mirror of original LPAR's rootvg.
....
Moving workload to surrogate LPAR.
............
Blackout Time started.
Blackout Time end.
Workload is running on surrogate LPAR.
....
Shutting down the Original LPAR.
....
The live update operation succeeded.
Broadcast message from root@orion (pts/0) at 02:47:20 ...
Live AIX update completed.
Monitoring the Live Update process, from the PowerVC UI, we observe several events and actions taking place. For example, we notice the automatic creation and allocation of new disks. These disks are used by Live Update for creating and booting the surrogate partition. You’ll see something similar to the following, in the UI console.
New disk automatically assigned to original LPAR for creation of surrogate boot disk
New disk automatically assigned for mirror of rootvg
PowerVC Messages indicating the creation & attachment of new volumes for Live Update
Before the Live Update process starts, we find only one Virtual Machine (VM), named orion (the original LPAR). Once the process starts, we eventually see another VM with the same name but a different IP address. This is the surrogate LPAR, which will ultimately house the migrated workload from the original LPAR. The IP address for the surrogate is automatically assigned from the PowerVC IP pool. Once the operation completes, we are left with only the new surrogate LPAR, which is re-assigned the same IP address as the original LPAR. We also observe that the VMs have different instance IDs in PowerVC.
Original LPAR (IP address 10.1.50.191)
Original LPAR details (existing instance ID)
PowerVC deploy of surrogate VM
Surrogate LPAR deployed by PowerVC – in state of “Building”.
Surrogate LPAR deployed with new IP address 10.1.50.192, from PowerVC IP Pool.
Surrogate LPAR after workload migrated to it. Original IP assigned to surrogate VM.
Surrogate LPAR details (new instance ID)
Removal of original LPAR and cleanup of resources after Live Update
If you wish to remove the Live Update surrogate boot disk (lvup_rootvg, after a reboot), you can use the clvupdate command. This command will contact the PowerVC management server (requires a pvcauth token) and automatically remove the volume group and disk from the LPAR, as shown below.
; reboot LPAR
root@orion / # lspv
root@orion / # pvcauth -u pvcadmin -p heathrow -a cgpvc
root@orion / # clvupdate -v
root@orion / # lspv
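Before running clvupdate, it can be handy to confirm that lvup_rootvg is actually present on the LPAR. This is a small sketch that filters lspv output by volume group name, assuming the usual lspv column order (disk name, PVID, volume group, state). The function name disks_in_vg is hypothetical.

```shell
# Hypothetical helper: read `lspv` output on stdin and print the hdisk
# name(s) that belong to the given volume group (third column).
# Returns non-zero if no disk belongs to that volume group.
disks_in_vg() {
    awk -v vg="$1" '$3 == vg { print $1; found = 1 }
                    END { exit !found }'
}
```

For example: lspv | disks_in_vg lvup_rootvg prints the surrogate boot disk if it exists, so a script could run clvupdate -v only when there is something to clean up.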
A message will appear in the PowerVC UI, indicating that the volume has been detached and deleted from the VM.
I should also note that you can monitor the status of the PowerVC VMs and storage volumes using the OpenStack nova and cinder commands from the PowerVC management server. For example:
List VMs
[root@cgpvc ~]# nova list | grep -i orion | a1c9 | a1c9
List Volumes
[root@cgpvc ~]# cinder --service-type volume list | grep orion | 4bd8
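Since the original and surrogate VMs share the same name during Live Update, grep alone will not tell them apart at a glance. The following sketch pulls the ID and Status columns out of the nova list table for an exact name match. It assumes the standard OpenStack table layout (| ID | Name | Status | ...), and the function name nova_vm_status is hypothetical.

```shell
# Hypothetical helper: read `nova list` table output on stdin and print
# "ID STATUS" for every row whose Name column equals the given name.
# Assumes the column order: | ID | Name | Status | ...
nova_vm_status() {
    awk -F'|' -v name="$1" '
        NF >= 4 {
            id = $2; nm = $3; st = $4
            # Trim the padding spaces around each cell.
            gsub(/^ +| +$/, "", id)
            gsub(/^ +| +$/, "", nm)
            gsub(/^ +| +$/, "", st)
            if (nm == name) print id, st
        }'
}
```

For example: nova list | nova_vm_status orion. While a Live Update is in progress you would expect two lines, one per instance ID, and a single line once the original LPAR has been removed.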
Whilst Live Update might still be new to many AIX administrators, it promises to be a much-used and lauded feature. The integration with PowerVC will only help to further simplify and automate an already innovative process. For customers that have already started managing their private clouds using IBM PowerVC, this is another welcome extension to automating their AIX cloud environments.
Great article Chris. Do you know if there is a way of selecting the surrogate LPAR IP or does it have to just 'pick from pool'?
Thanks Stu. AFAIK there's no way to specify an IP address, at the moment. PowerVC just picks from the IP pool.
Great article Chris - additionally - PowerVC 1.4.0 (https://developer.ibm.com/powervc/2017/10/10/announcing-powervc-1-4-0/) allows live capture of systems, so you can patch/update a SOE system Live, then Capture it without shutdown to be deployed as the new GOLD image (if that is the deployment cycle that you use on your site)
Thanks Dom. You can enable live capture using "# powervc-config compute live-capture --enable --host MYHOST --restart".
A couple of questions: what is the minimum PowerVC role required for pvcauth user authentication? I'm hoping not Administrator. Can authentication occur via the NovaLink partition over the MGMTSWITCH, or does the VM need comms to the PowerVC server?
https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.cmds4/pvcauth.htm
"You can use the pvcauth command......if you have appropriate PowerVC administrative authority."
"Can authentication occur via the NovaLink partition over the MGMTSWITCH....?". No. The PowerVC mgmt server might be on a different machine to that of the NovaLink partition. This would require network connectivity between the two systems.
Okay so it seems VM Manager + Storage Manager will do the trick.
That's unfortunate, routable comms to the PowerVC server from the VM would add challenges to an IaaS cloud offering.
I believe PowerVC "Administrator" role is required, otherwise you'll receive this message: "1430-175 FAILED: User does not have PowerVC permissions (admin role) for Live Update processing."
You make an excellent point re: IaaS. Please submit an RFE for PowerVC to cater for this requirement. Thanks. https://www.ibm.com/developerworks/rfe/
@cggibbo
Chris, thanks a lot! In your experience, could we say that AIX Live Update is stable enough that customers can use it without hesitation in a production environment?
Yes. Of course, customers should thoroughly test Live Update in a non-production environment BEFORE implementing it in production.