Testing AIX Live Update Across Power Systems Frames

This entry marks the 10 year anniversary of my “AIX & PowerVM” blog. Little did I know that, 10 years on from July 8th, 2009, I’d still be blogging about AIX (and PowerVM). Nor did I ever suspect I’d be talking about being able to update AIX systems live, without a reboot (it still seems like “magic” to me!). Which brings me to the topic of today’s blog post.
After reading the article “AIX Live Update Across Frames using PowerVC in case of Insufficient System Resource Availability”, I was inspired to test it in my lab. I simply followed the instructions provided and was able to successfully live update an AIX system across two Power Systems servers.
I found this
method to be a great way of performing live updates in environments
where you may be low on available CPU capacity on your Power Systems
servers. This may be preventing you from using live update altogether.
Having the capability to move your workload, live, to another system,
where you have enough spare (unallocated) capacity, may get you over the
line for AIX live update. There are other methods that can also help,
such as CPU reduction. I’ve written about this before, here, http
What follows are the steps I performed and some of the output I captured for future reference, from a test with live update across frames.
Note: My AIX virtual machine (VM) environment is managed by PowerVC (1.4.1.0). There are three (S822 Power Systems) Hosts, managed by the PowerVC management server. In this environment, the AIX VM typically resides on the host frame called D21. The temporary destination host frame is named D47. Before attempting the live update, we first made sure that Live Partition Mobility (LPM) worked for the AIX VM between the two Power Systems, D21 and D47. We used PowerVC to manually migrate the VM from D21 to D47 and back again. The image below is a screenshot of the Hosts managed in the PowerVC lab, with the source and destination hosts labelled accordingly.
1. First, I configured the live update data file. For the purposes of this test, I chose to "force" the migration to another frame. I deliberately selected the D47 host as the temporary destination host.
root@orion / # cd /var
root@orion / # cat lvupdate.data
general:

pvc:
        management_console = cgpvc
        user = root
        destination = D47
        force_migration = yes
The
force_migration attribute forces Live Update to occur on another host
even if sufficient resources are available on the current host. The
force_migration attribute can be either “yes”, “no”, or missing/empty.
The missing/empty value is treated as “no”. Any other value is invalid.
The /var
# destination = < PowerVC host name | ANY > This attribute is used to
# specify the host on which the Live Update operation will be executed
# if the resources are insufficient on the local host, or if the
# force_migration attribute is specified with a value equals to yes.
# If the attribute is set to ANY, the destination will be selected
# according to the placement policy of the host group in PowerVC.
# If this parameter or its value is not specified, the Live Update
# operation is executed locally.
# force_migration = < yes | no > When set to yes, the Live Update
# operation is executed on the host specified by the destination
# attribute even if the resources are sufficient locally.
# This parameter is optional and can be set to yes only when a
# destination is specified.
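The two rules described above (force_migration must be “yes”, “no” or empty, and “yes” only makes sense when a destination is set) are easy to check before running a preview. What follows is only a minimal sketch: check_pvc_stanza is a hypothetical helper of my own, not an AIX command, and it assumes the simple "name = value" stanza layout shown in my lvupdate.data file.

```shell
# Hypothetical helper (not part of AIX): sanity-check the pvc stanza of an
# lvupdate.data-style file. Assumes "stanza:" headers and "name = value" lines.
check_pvc_stanza() {
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        /^[A-Za-z_]+:[ \t]*$/ {                  # stanza header, e.g. "pvc:"
            stanza = $1; sub(/:$/, "", stanza); next
        }
        stanza == "pvc" && /=/ {                 # attribute inside pvc stanza
            key = $0; sub(/[ \t]*=.*/, "", key); gsub(/^[ \t]+/, "", key)
            val = $0; sub(/^[^=]*=[ \t]*/, "", val); gsub(/[ \t]+$/, "", val)
            attr[key] = val
        }
        END {
            fm = attr["force_migration"]
            # Only yes, no, or missing/empty are valid values.
            if (fm != "" && fm != "yes" && fm != "no") {
                print "invalid force_migration value: " fm; exit 1
            }
            # yes is only allowed when a destination is specified.
            if (fm == "yes" && attr["destination"] == "") {
                print "force_migration = yes requires a destination"; exit 1
            }
            print "ok"
        }
    ' "$1"
}

# Example use (the path is whatever copy of the file you want to check):
#   check_pvc_stanza ./lvupdate.data
```

The function prints "ok" and returns 0 when the stanza is consistent, otherwise it prints the reason and returns 1, so it can be used as a guard in front of the preview command.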
2. Next I authenticated with my PowerVC server and performed a live update preview operation.
root@orion / # pvcauth -u pvcadmin -p abc1234 -a cgpvc
root@orion / # pvcauth -l
Address  : 10.1.5.2
User name: pvcadmin
Project  : ibm-default
Port     : 5000
TTL      : 5:59:58
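Since the token has a limited lifetime (the TTL line above), it can be worth comparing the remaining TTL with the total operation time that the preview estimates. This is just a sketch under the assumption that the TTL is always printed as H:MM:SS, as it is in my output; ttl_to_seconds and token_outlives are hypothetical helper names.

```shell
# Hypothetical helpers: compare a pvcauth-style TTL with an estimated
# operation time. Assumes the TTL is formatted as H:MM:SS.
ttl_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# token_outlives TTL SECONDS
# Returns 0 when the TTL is longer than the given number of seconds.
token_outlives() {
    [ "$(ttl_to_seconds "$1")" -gt "$2" ]
}

# Example use with the values from this post:
#   token_outlives 5:59:58 1438 && echo "token should last long enough"
```

With my values, 5:59:58 is 21598 seconds, comfortably more than the estimated total operation time.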
root@orion / # geninstall -kp
Validating live update input data.

Computing the estimated time for the live update operation:
----
LPAR: orion
Blackout time(in seconds): 17
Total operation time(in seconds): 1438

Checking mirror vg device size:
----
Required device size: 48384 MB
PASSED: The disks specified for the mirrored rootvg resulted in a valid volume group factor.

Checking new root vg device size:
----
Required device size: 48384 MB

Checking temporary storage size for original LPAR:
----
Required device size: 0 or Undetermined MB

Checking temporary storage size for surrogate LPAR:
----
Required device size: 0 or Undetermined MB

Checking lpar minimal memory size:
----
Required memory size: 2048 MB

Checking other requirements:
----
PASSED: sufficient space available in /var.
PASSED: sufficient space available in /.
PASSED: no existing altinst_rvgLvup.
PASSED: rootvg is not part of a snapshot.
PASSED: pkcs11 is not installed.
PASSED: DoD/DoDv2 profile is not applied.
PASSED: Advanced Accounting is not on.
PASSED: Virtual Trusted Platform Module is not on.
PASSED: multiple semid lists is not on.
PASSED: sufficient file system space for interim fix(es) is available.
PASSED: The trustchk Trusted Execution Policy is not on.
PASSED: The trustchk Trusted Library Policy is not on.
PASSED: The trustchk TSD_FILES_LOCK policy is not on.
PASSED: the boot disk is set to the current rootvg.
PASSED: the mirrorvg name is available.
PASSED: the rootvg is uniformly mirrored.
PASSED: the rootvg does not have the maximum number of mirror copies.
PASSED: the rootvg does not have stale logical volumes.
PASSED: all of the mounted file systems are of a supported type.
PASSED: this AIX instance is not diskless.
PASSED: no Kerberos configured for NFS mounts.
PASSED: multibos environment not present.
PASSED: Trusted Computing Base not defined.
PASSED: no local tape devices found.
PASSED: live update not executed from console.
PASSED: the execution environment is valid.
PASSED: enough available space for /var to dump Component Trace buffers.
PASSED: enough available space for /var to dump Light weight memory Trace buffers.
PASSED: all devices are virtual devices.
PASSED: No active workload partition found.
PASSED: nfs configuration supported.
PASSED: RSCT daemons are active.
PASSED: no Kerberos configuration.
PASSED: no virtual log device configured.
PASSED: PowerVC token is valid.
PASSED: PowerVC is at a supported level.
PASSED: User has PowerVC permissions for Live Update processing.
PASSED: Host is not in maintenance mode.
PASSED: PowerVC token expiration date is valid.
PASSED: PowerVC network devices match those present on partition.
PASSED: PowerVC volumes match hdisks present on partition.
PASSED: All rootvg volumes are boot volumes in PowerVC.
PASSED: Enough free space on storage provider.
PASSED: Sufficient processing units available on target host.
PASSED: Sufficient memory available on target host.
PASSED: Capacity check on SRIOV ports of target host.
PASSED: Original and destination hosts are in the same PowerVC host group.
PASSED: PowerVM Enterprise Edition is activated on both original and destination hosts.
PASSED: Destination host supports the processor compatibility mode of the LPAR.
PASSED: Original and destination hosts have the same logical-memory block (LMB) size.
PASSED: Destination host fulfills the network placement requirements.
PASSED: Destination host fulfills the storage placement requirements.
PASSED: Destination host satisfies the collocation rules.
PASSED: PowerVC virtual machine health status is 'OK'.
PASSED: the disk configuration is supported.
PASSED: no Generic Routing Encapsulation (GRE) tunnel configured.
PASSED: Firmware level is supported.
PASSED: Consolidated system trace buffers size is within the limit of 64 MB.
PASSED: SMT number is valid.
PASSED: No process attached to vty0.
PASSED: No active ipsec configuration found.
PASSED: Audit is not enabled in stream mode.
PASSED: No exclusive rsets (sysxrset) found.
INFO: Any system dumps present in the current dump logical volumes will not be available after live update is complete.
INFO: Temporary migration server: D47.
I noticed that the output from the geninstall command looked very similar to a normal/standard live update operation, with the exception of the output shown below. This output is specifically related to checking the temporary destination host (in the PowerVC managed environment) for sufficient capacity, appropriate PowerVM capabilities, logical memory block sizes and other compatibility checks. And the last line, INFO:, simply indicates the name of the temporary destination frame, as specified in my lvupdate.data config file.

…
PASSED: Sufficient processing units available on target host.
PASSED: Sufficient memory available on target host.
PASSED: Capacity check on SRIOV ports of target host.
PASSED: Original and destination hosts are in the same PowerVC host group.
PASSED: PowerVM Enterprise Edition is activated on both original and destination hosts.
PASSED: Destination host supports the processor compatibility mode of the LPAR.
PASSED: Original and destination hosts have the same logical-memory block (LMB) size.
PASSED: Destination host fulfills the network placement requirements.
PASSED: Destination host fulfills the storage placement requirements.
PASSED: Destination host satisfies the collocation rules.
...
INFO: Temporary migration server: D47.
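With this many checks scrolling past, a small filter can pull out the two things I usually want from a preview: how many checks passed and which frame was chosen as the temporary destination. This is a rough sketch only. summarize_preview is a hypothetical helper, and it assumes one check per line with the "PASSED:" and "INFO: Temporary migration server:" wording seen in my output.

```shell
# Hypothetical helper: summarize live update preview output read from stdin.
# Assumes one check per line, worded as in the output shown in this post.
summarize_preview() {
    awk '
        /PASSED:/ { passed++ }                          # count passed checks
        /INFO: Temporary migration server:/ {
            host = $0
            sub(/.*Temporary migration server:[ \t]*/, "", host)
            sub(/\.[ \t]*$/, "", host)                  # drop trailing period
        }
        END { printf "passed=%d destination=%s\n", passed, host }
    '
}

# Example use on the AIX system:
#   geninstall -kp 2>&1 | summarize_preview
```

An empty destination in the summary would suggest the preview did not report a temporary migration server, which is worth a closer look at the full output.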
3. I performed the live update operation and monitored the status from the command line on the AIX system and from the PowerVC UI.
root@orion / # geninstall -k
Validating live update input data.

Computing the estimated time for the live update operation:
----
LPAR: orion
Blackout time(in seconds): 17
Total operation time(in seconds): 1466

Checking mirror vg device size:
----
Required device size: 48384 MB
PASSED: The disks specified for the mirrored rootvg resulted in a valid volume group factor.

Checking new root vg device size:
----
Required device size: 48384 MB

Checking temporary storage size for original LPAR:
----
Required device size: 0 or Undetermined MB

Checking temporary storage size for surrogate LPAR:
----
Required device size: 0 or Undetermined MB

Checking lpar minimal memory size:
----
Required memory size: 2048 MB

Checking other requirements:
----
PASSED: sufficient space available in /var.
PASSED: sufficient space available in /.
PASSED: no existing altinst_rvgLvup.
PASSED: rootvg is not part of a snapshot.
PASSED: pkcs11 is not installed.
PASSED: DoD/DoDv2 profile is not applied.
PASSED: Advanced Accounting is not on.
PASSED: Virtual Trusted Platform Module is not on.
PASSED: multiple semid lists is not on.
PASSED: sufficient file system space for interim fix(es) is available.
PASSED: The trustchk Trusted Execution Policy is not on.
PASSED: The trustchk Trusted Library Policy is not on.
PASSED: The trustchk TSD_FILES_LOCK policy is not on.
PASSED: the boot disk is set to the current rootvg.
PASSED: the mirrorvg name is available.
PASSED: the rootvg is uniformly mirrored.
PASSED: the rootvg does not have the maximum number of mirror copies.
PASSED: the rootvg does not have stale logical volumes.
PASSED: all of the mounted file systems are of a supported type.
PASSED: this AIX instance is not diskless.
PASSED: no Kerberos configured for NFS mounts.
PASSED: multibos environment not present.
PASSED: Trusted Computing Base not defined.
PASSED: no local tape devices found.
PASSED: live update not executed from console.
PASSED: the execution environment is valid.
PASSED: enough available space for /var to dump Component Trace buffers.
PASSED: enough available space for /var to dump Light weight memory Trace buffers.
PASSED: all devices are virtual devices.
PASSED: No active workload partition found.
PASSED: nfs configuration supported.
PASSED: RSCT daemons are active.
PASSED: no Kerberos configuration.
PASSED: no virtual log device configured.
PASSED: PowerVC token is valid.
PASSED: PowerVC is at a supported level.
PASSED: User has PowerVC permissions for Live Update processing.
PASSED: Host is not in maintenance mode.
PASSED: PowerVC token expiration date is valid.
PASSED: PowerVC network devices match those present on partition.
PASSED: PowerVC volumes match hdisks present on partition.
PASSED: All rootvg volumes are boot volumes in PowerVC.
PASSED: Enough free space on storage provider.
PASSED: Sufficient processing units available on target host.
PASSED: Sufficient memory available on target host.
PASSED: Capacity check on SRIOV ports of target host.
PASSED: Original and destination hosts are in the same PowerVC host group.
PASSED: PowerVM Enterprise Edition is activated on both original and destination hosts.
PASSED: Destination host supports the processor compatibility mode of the LPAR.
PASSED: Original and destination hosts have the same logical-memory block (LMB) size.
PASSED: Destination host fulfills the network placement requirements.
PASSED: Destination host fulfills the storage placement requirements.
PASSED: Destination host satisfies the collocation rules.
PASSED: PowerVC virtual machine health status is 'OK'.
PASSED: the disk configuration is supported.
PASSED: no Generic Routing Encapsulation (GRE) tunnel configured.
PASSED: Firmware level is supported.
PASSED: Consolidated system trace buffers size is within the limit of 64 MB.
PASSED: SMT number is valid.
PASSED: No process attached to vty0.
PASSED: No active ipsec configuration found.
PASSED: Audit is not enabled in stream mode.
PASSED: No exclusive rsets (sysxrset) found.
INFO: Any system dumps present in the current dump logical volumes will not be available after live update is complete.
INFO: Temporary migration server: D47.
Non-interruptable live update operation begins in 10 seconds.
Broadcast message from root@orion (pts/0) at 15:36:29 ...
Live AIX update in progress.
Non-interruptable live update operation begins in 10 seconds.
Broadcast message from root@orion (pts/0) at 15:36:29 ...
Live AIX update in progress.
Initializing live update on original LPAR.
Validating original LPAR environment.
Beginning live update operation on original LPAR.
Migrating to temporary destination.
....
Requesting resources required for live update.
....
Notifying applications of impending live update.
....
Creating rootvg for boot of surrogate.
....
Starting the surrogate LPAR.
....
Creating mirror of original LPAR's rootvg.
....
Moving workload to surrogate LPAR.
....
Blackout Time started.
Blackout Time end.
Workload is running on surrogate LPAR.
....
Shutting down the Original LPAR.
....
Migrating back to initial host.
....
The live update operation succeeded.
Broadcast message from root@orion (pts/0) at 16:11:15 ...
Live AIX update completed.
During the live update, I noticed the VM entered a Migrating state, as shown in the PowerVC GUI. The VM was migrating away from the (S822) D21 host to the (S822) D47 host.
Once the migration (LPM) was completed, the VM was now housed on the D47 host, as shown in the PowerVC GUI.
With the LPM complete, the live update process began. The surrogate VM was created, as expected, also on the temporary destination frame, D47.
When the live update operation was completed, the surrogate VM was all that remained (as expected).
The VM then live migrated back to the original (source) host, D21.
The VM is back on D21 now.
The PowerVC GUI reported each of the migration steps and live update resource configuration processes in the PowerVC Messages view screen.
4. The AIX error report clearly showed the successful start and completion of the live update process, as well as the live partition mobility operations (to and from the temporary host).
root@orion / # errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
12295E0B   0611161119 I S LVUPDATE       Live AIX update completed successfully  <<< Live update finished successfully.
A5E6DB96   0611160819 I S pmig           Client Partition Migration Completed    <<< After live update. Migration back to original host finished
08917DC6   0611160819 I S pmig           Client Partition Migration Started      <<< After live update. Migration back to original host started
9DBCFDEE   0611155519 T O errdemon       ERROR LOGGING TURNED ON
A5E6DB96   0611153819 I S pmig           Client Partition Migration Completed    <<< Before live update. Migration to destination host finished
08917DC6   0611153719 I S pmig           Client Partition Migration Started      <<< Before live update. Migration from source host to destination host started
9A74C7AB   0611153619 I S LVUPDATE       Live AIX update started                 <<< Live update started
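The errpt timestamps (MMDDhhmmYY) also make it easy to work out how long the whole exercise took, end to end. Here is a small sketch; errpt_minutes_between is a hypothetical helper, and it assumes both entries fall on the same day and that each description appears only once in the input (if a description appears more than once, the last matching line wins).

```shell
# Hypothetical helper: minutes elapsed between two errpt entries, matched by
# a substring of their description. Reads errpt summary output from stdin.
# Assumes MMDDhhmmYY timestamps in column 2 and both entries on the same day.
errpt_minutes_between() {
    awk -v from="$1" -v to="$2" '
        # Convert the hhmm part of the timestamp to minutes since midnight.
        function minutes(ts) { return substr(ts, 5, 2) * 60 + substr(ts, 7, 2) }
        index($0, from) { s = minutes($2) }
        index($0, to)   { e = minutes($2) }
        END { print e - s }
    '
}

# Example use on the AIX system:
#   errpt | errpt_minutes_between "Live AIX update started" "Live AIX update completed"
```

For the entries above, that is 15:36 to 16:11, so 35 minutes from the start of the live update to its completion, including both partition migrations.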
I’m very
impressed by this innovative solution and I hope AIX administrators will
take advantage of this new capability when patching their critical AIX
systems in PowerVC managed cloud environments. If you want to learn more
about simplifying AIX Live Update with PowerVC, please refer to my 2017
articles on the subject, here http
I started this blog whilst I was an IBM customer, working as an AIX administrator for a large pseudo-government organisation in Melbourne, Australia. I’m still in Melbourne, but for the last 6 years I’ve been working for IBM STG, helping customers design and implement enterprise class Power Systems running AIX and PowerVM. The aim of this blog was to 1) help me remember what I’d done(!), 2) hopefully help others that might also be trying to do the same things I was trying to do and 3) share my learnings (including my mistakes) with other AIX customers, all over. That has been and will forever be the purpose of my blog. Thank you to all those people that have taken the time to read my blog over the last decade. I appreciate it. And for all those that have sent me feedback or questions, thank you for engaging with me and helping me to learn as well. I look forward to sharing my AIX/PowerVM experiences with everyone for some time to come.