Updating your PowerHA cluster with cl_ezupdate (Chris's AIX Blog)

cggibbo, Jan 18 2018

PowerHA 7.2.1 (June 2017) introduced the cl_ezupdate utility. You can use the cl_ezupdate command to update the software for the entire cluster or a subset of nodes in the cluster, often without disrupting workloads.

https://www.ibm.com/support/knowledgecenter/en/SSPHQG_7.2.2/com.ibm.powerha.cmds/hacmpcmds_whatsnew.htm

You can use this tool to apply and reject updates for PowerHA service packs, technology levels and interim fixes. The update is performed on the entire cluster or on a specific subset of nodes in the cluster. You can also apply updates in preview mode. In preview mode, all the prerequisites for the installation process are checked, but the cluster updates are not installed on the system.

The tool will also allow you to apply and reject updates for AIX service packs or interim fixes, but you cannot use it to update the cluster to newer AIX technology levels.

Make sure you read the "Limitations" section of the knowledge centre to understand the current limitations and restrictions for the cl_ezupdate command.

Upgrade PowerHA SystemMirror using the cl_ezupdate command

https://www.ibm.com/support/knowledgecenter/SSPHQG_7.2.1/com.ibm.powerha.insgd/ha_install_ugrade_clezupdate.htm

cl_ezupdate command

https://www.ibm.com/support/knowledgecenter/SSPHQG_7.2.2/com.ibm.powerha.cmds/cl_ezupdate.htm

cl_ezupdate Command Usage Information

Purpose:

Designed to manage the entire PowerHA cluster update

without interrupting the application activity.

Usage:

cl_ezupdate [-v] -h

cl_ezupdate [-v] -Q {cluster|node|nim} [-N <node1,node2,...>]

cl_ezupdate [-v] {-Q {lpp|all} |-P|-A|-C|-R} [-N <node1,node2,...>] -s <repository> [-F]

Description:

Query informations about cluster state and available updates,

install updates in preview mode or apply, commit or reject updates.

Flags:

-h Displays the help for this program.

-v Sets the verbose mode of help.

-s Specifies the update source.

It could be a directory, then it should start with "/" character.

It could be a LPP source if the update is done through NIM server.

-N Specify the node names. By default the scope is the entire cluster.

-Q Query cluster status and or available updates.

The scope of query request is: cluster|node|nim|lpp|all.

-P Do not install any update, just try to install it in preview mode.

-A Apply the updates located on the repository.

-C Commit the latest update version.

-R Roll back to the previous version.

-F Force mode. only combined with -A option.

Should be use if the install of SP is not possible because an Ifix

is locking an installable file set.

Output file:

/var/hacmp/EZUpdate/EZUpdate.log

Examples:

To display informations about NIM servers type:

cl_ezupdate -Q nim

To check and display contents of lpp source, type:

cl_ezupdate -Q lpp -s /tmp/lppsource/inst.images

To install in apply mode lpp located on NIM server, type:

cl_ezupdate -A -s HA_v720_lpp
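
The usage text above lists -N for limiting an operation to a subset of nodes, but none of the built-in examples show it. Here is a minimal sketch of a dry-run helper that assembles a cl_ezupdate command line and prints it instead of running it. The function name build_ezupdate_cmd is my own invention; only the flags (-P, -A, -C, -R, -N, -s) come from the usage text.

```shell
# Hypothetical helper (not part of PowerHA): print, but do not run,
# a cl_ezupdate command line.
# Usage: build_ezupdate_cmd <-P|-A|-C|-R> <source> [node1,node2,...]
build_ezupdate_cmd() {
    mode=$1
    src=$2
    nodes=$3

    # Accept only the four action flags listed in the usage text.
    case "$mode" in
        -P|-A|-C|-R) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac

    if [ -z "$src" ]; then
        echo "missing update source" >&2
        return 1
    fi

    cmd="cl_ezupdate $mode"
    # -N is optional; without it the scope is the entire cluster.
    if [ -n "$nodes" ]; then
        cmd="$cmd -N $nodes"
    fi
    echo "$cmd -s $src"
}
```

For example, build_ezupdate_cmd -P /tmp/lppsource/inst.images lpar9 prints cl_ezupdate -P -N lpar9 -s /tmp/lppsource/inst.images, which you can review before pasting it into a root shell.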

------------

In the following example, I will update my two node PowerHA cluster from PowerHA 7.2.1 SP1 to SP2. Both nodes are running AIX 7.1 TL4 SP4.

PowerHA fix information For PowerHA 7.2.1 Service Pack 2

https://delivery04.dhe.ibm.com/sar/CMA/OSA/07af3/2/ha721sp2.fixinfo.html

I'll use an LPP source on my NIM master to provide the updates to the cl_ezupdate command.

The LPP source, on the NIM master, is named pha721sp2.

# lsnim -t lpp_source | grep pha72

pha721sp2 resources lpp_source

# lsnim -l pha721sp2

pha721sp2:

class = resources

type = lpp_source

arch = power

Rstate = ready for use

prev_state = unavailable for use

location = /export/mksysb/cg/pha721sp2

alloc_count = 0

server = master

Both PowerHA nodes have already been configured as NIM clients to the NIM master. I can run nimclient -l, from each node, to confirm they’re able to communicate with the master.

[root@lpar1]/# nimclient -l

master machines master

boot resources boot

nim_script resources nim_script

10_1_50 networks ent

710lpp_res resources lpp_source

AIX71TL4SP4 resources lpp_source

hsc02 management hmc

750_stg management cec

....

Use the clcmd command to confirm the HA and AIX levels on both nodes in the cluster.

[root@lpar1]/# clcmd oslevel -s

-------------------------------

NODE lpar9.meldemo.au.ibm.com

-------------------------------

7100-04-04-1717

-------------------------------

NODE lpar1.meldemo.au.ibm.com

-------------------------------

7100-04-04-1717

[root@lpar1]/# clcmd halevel -s

-------------------------------

NODE lpar9.meldemo.au.ibm.com

-------------------------------

7.2.1 SP1

-------------------------------

NODE lpar1.meldemo.au.ibm.com

-------------------------------

7.2.1 SP1
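
If you have more than a couple of nodes, comparing the clcmd output by eye gets tedious. The following is a rough sketch of two small filters that read clcmd output on standard input: one turns it into "node value" pairs, the other succeeds only when every node reports the same value. The function names are mine, and the parsing assumes the NODE/dashed-line layout shown above.

```shell
# Hypothetical helpers: parse the output of `clcmd <command>` from stdin.

# Print one "node value" line for each value line reported by a node.
clcmd_pairs() {
    awk '
        NF == 0       { next }             # skip blank lines
        /^-+$/        { next }             # skip dashed separator lines
        $1 == "NODE"  { node = $2; next }  # remember the current node
        node != ""    { print node, $0 }   # value line for that node
    '
}

# Succeed (exit 0) only if all nodes report exactly the same value.
clcmd_all_equal() {
    clcmd_pairs | awk '
        { $1 = ""; seen[$0] = 1 }          # drop the node name, keep the value
        END {
            n = 0
            for (v in seen) n++
            exit (n == 1 ? 0 : 1)
        }
    '
}
```

A possible use is clcmd halevel -s | clcmd_all_equal || echo "nodes are at different PowerHA levels", run before and after the update.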

From one of the nodes, I can run cl_ezupdate -Q nim to query the LPP source resources on the NIM master. I'm looking for pha721sp2 (marked with <<< in the output below).

[root@lpar1]/# cl_ezupdate -Q nim

Checking for root authority...

Running as root.

Checking for AIX level...

The installed AIX version is supported.

Checking for PowerHA SystemMirror version.

The installed PowerHA SystemMirror version is supported.

Checking for clcomd communication on all nodes...

clcomd on each node can both send and receive messages.

INFO: The cluster: STGMELB is in state: STABLE

INFO: The node: lpar1 is in state: NORMAL

INFO: The node: lpar9 is in state: NORMAL

Checking for NIM servers...

Available lpp_source on NIM server: 750lpar4 from node: lpar9 :

710lpp_res resources lpp_source

AIX71TL4SP4 resources lpp_source

powerHA_710_base resources lpp_source

powerHA712 resources lpp_source

aix72tl1_sp2_only resources lpp_source

cglppsrc resources lpp_source

tncpm_7200-01-03_lpp resources lpp_source

xxxlpp resources lpp_source

pha721sp2 resources lpp_source <<<

java8 resources lpp_source

AIX53TL12SP9_P8 resources lpp_source

AIX72TL2SP1 resources lpp_source

Available lpp_source on NIM server: 750lpar4 from node: lpar1 :

710lpp_res resources lpp_source

AIX71TL4SP4 resources lpp_source

powerHA_710_base resources lpp_source

powerHA712 resources lpp_source

aix72tl1_sp2_only resources lpp_source

cglppsrc resources lpp_source

tncpm_7200-01-03_lpp resources lpp_source

xxxlpp resources lpp_source

pha721sp2 resources lpp_source <<<

java8 resources lpp_source

AIX53TL12SP9_P8 resources lpp_source

AIX72TL2SP1 resources lpp_source

[root@lpar1]/#
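
With a long list of NIM resources it is easy to miss the one you want. As a sketch only, this awk filter reads the cl_ezupdate -Q nim output and prints each node whose listing contains a given lpp_source. The function name is made up, and the parsing relies on the "Available lpp_source on NIM server: ... from node: ..." header lines looking exactly as they do above.

```shell
# Hypothetical helper: read `cl_ezupdate -Q nim` output on stdin and print
# every node whose NIM listing contains the named lpp_source.
# Usage: cl_ezupdate -Q nim | nodes_with_lpp_source pha721sp2
nodes_with_lpp_source() {
    awk -v want="$1" '
        # Header, e.g. "Available lpp_source on NIM server: X from node: lpar9 :"
        /^Available lpp_source on NIM server:/ {
            node = ""
            for (i = 1; i < NF; i++)
                if ($i == "node:") node = $(i + 1)
            next
        }
        # Resource line, e.g. "pha721sp2 resources lpp_source"
        node != "" && $1 == want && $3 == "lpp_source" { print node }
    '
}
```

If a node is missing from the result, that node cannot see the resource, and the update would most likely fail there.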

Preview the update first, just to be sure there’s no issue with the filesets in the NIM resource.

[root@lpar1]/# cl_ezupdate -P -v -s pha721sp2

Checking for root authority...

Running as root.

Checking for AIX level...

The installed AIX version is supported.

Checking for PowerHA SystemMirror version.

The installed PowerHA SystemMirror version is supported.

Checking for clcomd communication on all nodes...

clcomd on each node can both send and receive messages.

INFO: The cluster: STGMELB is in state: STABLE

INFO: The node: lpar1 is in state: NORMAL

INFO: The node: lpar9 is in state: NORMAL

Checking for NIM servers...

Checking for lpps and Ifixes from source: pha721sp2...

Build lists of filesets that can be apply reject or commit on node lpar9

Fileset list to apply on node lpar9: cluster.es.assist.db2 cluster.es.assist.sap cluster.es.assist.websphere cluster.es.client.lib cluster.es.cspoc.cmds cluster.es.cspoc.rte cluster.es.server.diag cluster.es.server.events cluster.es.server.rte cluster.es.server.utils cluster.es.smui.agent cluster.es.smui.common

There is nothing to commit or reject on node: lpar9 from source: pha721sp2

Build lists of filesets that can be apply reject or commit on node lpar1

Fileset list to apply on node lpar1: cluster.es.assist.db2 cluster.es.assist.sap cluster.es.assist.websphere cluster.es.client.lib cluster.es.cspoc.cmds cluster.es.cspoc.rte cluster.es.server.diag cluster.es.server.events cluster.es.server.rte cluster.es.server.utils cluster.es.smui.agent cluster.es.smui.common

There is nothing to commit or reject on node: lpar1 from source: pha721sp2

Installing fileset updates in preview mode on node: lpar9...

Succeeded to install preview updates on node: lpar9.

Installing fileset updates in preview mode on node: lpar1...

Succeeded to install preview updates on node: lpar1.
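
If you want to script the "preview first, apply only if it is clean" habit, one approach is to check that every node that started a preview also reported success. This is a sketch under the assumption that the two message formats shown above ("Installing fileset updates in preview mode on node: ..." and "Succeeded to install preview updates on node: ...") are stable; the function name is hypothetical.

```shell
# Hypothetical helper: read `cl_ezupdate -P` output on stdin. Succeeds only if
# every node with an "Installing ... in preview mode" line also has a
# "Succeeded to install preview updates" line.
preview_passed() {
    awk '
        /^Installing fileset updates in preview mode on node:/ {
            n = $NF; sub(/\.+$/, "", n)    # strip the trailing "..."
            started[n] = 1
        }
        /^Succeeded to install preview updates on node:/ {
            n = $NF; sub(/\.+$/, "", n)    # strip the trailing "."
            ok[n] = 1
        }
        END {
            bad = 0; count = 0
            for (n in started) {
                count++
                if (!(n in ok)) {
                    print "preview did not succeed on " n
                    bad = 1
                }
            }
            # Fail if any node is missing a success line, or if no preview ran.
            exit ((bad || count == 0) ? 1 : 0)
        }
    '
}
```

You could then run something like cl_ezupdate -P -s pha721sp2 | preview_passed and go on to the apply step only when it returns 0.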

To apply SP2 to both nodes in the cluster, I run cl_ezupdate -A. The process updates the second node first, as it does not currently house the HA resource group. That node rejoins the cluster after the updates have been applied. Once the second node has been updated successfully, the first node is updated. Note that the cluster resources remain available: the resource group is placed in an UNMANAGED state during the update, so there’s no disruption to the HA applications.

Applying Updates

[root@lpar1]/# cl_ezupdate -A -v -s pha721sp2

Checking for root authority...

Running as root.

Checking for AIX level...

The installed AIX version is supported.

Checking for PowerHA SystemMirror version.

The installed PowerHA SystemMirror version is supported.

Checking for clcomd communication on all nodes...

clcomd on each node can both send and receive messages.

INFO: The cluster: STGMELB is in state: STABLE

INFO: The node: lpar1 is in state: NORMAL

INFO: The node: lpar9 is in state: NORMAL

Checking for NIM servers...

Checking for lpps and Ifixes from source: pha721sp2...

Build lists of filesets that can be apply reject or commit on node lpar9

There is nothing to commit or reject on node: lpar9 from source: pha721sp2

Build lists of filesets that can be apply reject or commit on node lpar1

There is nothing to commit or reject on node: lpar1 from source: pha721sp2

Installing fileset updates in preview mode on node: lpar9...

Succeeded to install preview updates on node: lpar9.

Installing fileset updates in preview mode on node: lpar1...

Succeeded to install preview updates on node: lpar1.

Stopping the node lpar9...

Stopping PowerHA cluster services on node: lpar9 in offline mode...

lpar9: 0513-044 The clinfoES Subsystem was requested to stop.

lpar9: 0513-044 The clevmgrdES Subsystem was requested to stop.

"lpar9" is now offline.

lpar9: Jan 17 2018 14:39:05/usr/es/sbin/cluster/utilities/clstop: called with flags -N -g

Applying updates on node: lpar9...

Succeeded to apply updates on node: lpar9.

Starting the node: lpar9...

Starting cluster manager daemon: clstrmgrES...

Starting PowerHA cluster services on node: lpar9 in manual mode...

Verifying cluster configuration prior to starting cluster services

WARNING: Cluster verification detected that some cluster components are

inactive. Please use the matrix below to verify the status of

inactive components:Node: lpar9 State: DOWN

WARNING: No backup repository disk is UP and not already part of a VG for nodes :

lpar9: start_cluster: Starting PowerHA SystemMirror

lpar9: 3604622 - 0:00 syslogd

lpar9: Setting routerevalidate to 1

lpar9: 0513-059 The clevmgrdES Subsystem has been started. Subsystem PID is 13959390.

...

"lpar9" is now online.

Starting Cluster Services on node: lpar9

This may take a few minutes. Please wait...

lpar9: Jan 17 2018 14:44:36Starting execution of /usr/es/sbin/cluster/etc/rc.cluster

lpar9: with parameters: -boot -N -b -P cl_rc_cluster -A

lpar9:

lpar9: Jan 17 2018 14:44:36usage: cl_echo messageid (default) messageJan 17 2018 14:44:36usage: cl_echo messageid (default) messageJan 17 2018 14:44:37

lpar9: /usr/es/sbin/cluster/utilities/clstart: called with flags -m -G -b -P cl_rc_cluster -B -A

lpar9:

lpar9: Jan 17 2018 14:44:39

lpar9: Completed execution of /usr/es/sbin/cluster/etc/rc.cluster

lpar9: with parameters: -boot -N -b -P cl_rc_cluster -A.

lpar9: Exit status = 0

lpar9:

Stopping the node lpar1...

Stopping PowerHA cluster services on node: lpar1 in unmanage mode...

Broadcast message from root@lpar1 (tty) at 14:45:04 ...

PowerHA SystemMirror on lpar1 shutting down. Please exit any cluster applications...

lpar1: 0513-044 The clevmgrdES Subsystem was requested to stop.

"lpar1" is now unmanaged.

lpar1: Jan 17 2018 14:45:04/usr/es/sbin/cluster/utilities/clstop: called with flags -N -f

Applying updates on node: lpar1...

Succeeded to apply updates on node: lpar1.

Starting the node: lpar1...

Starting cluster manager daemon: clstrmgrES...

Starting PowerHA cluster services on node: lpar1 in auto mode...

Verifying cluster configuration prior to starting cluster services

Verifying Cluster Configuration Prior to Starting Cluster Services.

Verifying node(s): lpar1 against the running node lpar9

WARNING: No backup repository disk is UP and not already part of a VG for nodes :

Successfully verified node(s): lpar1

lpar1: start_cluster: Starting PowerHA SystemMirror

.......

"lpar1" is now online.

Starting Cluster Services on node: lpar1

This may take a few minutes. Please wait...

lpar1: Jan 17 2018 14:51:49Starting execution of /usr/es/sbin/cluster/etc/rc.cluster

lpar1: with parameters: -boot -N -b -P cl_rc_cluster -A

lpar1:

lpar1: Jan 17 2018 14:51:49usage: cl_echo messageid (default) messageJan 17 2018 14:51:49usage: cl_echo messageid (default) messageRETURN_CODE=0

[root@lpar1]/#

Both nodes are now running PowerHA 7.2.1 SP2.

[root@lpar1]/# clcmd halevel -s

-------------------------------

NODE lpar9.meldemo.au.ibm.com

-------------------------------

7.2.1 SP2

-------------------------------

NODE lpar1.meldemo.au.ibm.com

-------------------------------

7.2.1 SP2

During the update, the resource group state is UNMANAGED on both nodes, as shown below in the cldump output.

[root@lpar1]/var/hacmp/log# cldump

Obtaining information via SNMP from Node: lpar1...

_____________________________________________________________________________

Cluster Name: STGMELB

Cluster State: UP

Cluster Substate: STABLE

_____________________________________________________________________________

Node Name: lpar1 State: UP

Network Name: net_ether_01 State: UP

Address: 10.1.50.199 Label: lpar1svc State: UP

Address: 10.1.50.31 Label: lpar1 State: UP

Node Name: lpar9 State: UP

Network Name: net_ether_01 State: UP

Address: 10.1.50.39 Label: lpar9 State: UP

Cluster Name: STGMELB

Resource Group Name: RG1

Startup Policy: Online On Home Node Only

Fallover Policy: Fallover To Next Priority Node In The List

Fallback Policy: Never Fallback

Site Policy: ignore

Node Group State

---------------------------------------------------------------- ---------------

lpar1 UNMANAGED

lpar9 UNMANAGED

After the update, the cluster is stable and all resources are exactly where they should be.

[root@lpar1]/# cldump

Obtaining information via SNMP from Node: lpar1...

_____________________________________________________________________________

Cluster Name: STGMELB

Cluster State: UP

Cluster Substate: STABLE

_____________________________________________________________________________

Node Name: lpar1 State: UP

Network Name: net_ether_01 State: UP

Address: 10.1.50.199 Label: lpar1svc State: UP

Address: 10.1.50.31 Label: lpar1 State: UP

Node Name: lpar9 State: UP

Network Name: net_ether_01 State: UP

Address: 10.1.50.39 Label: lpar9 State: UP

Cluster Name: STGMELB

Resource Group Name: RG1

Startup Policy: Online On Home Node Only

Fallover Policy: Fallover To Next Priority Node In The List

Fallback Policy: Never Fallback

Site Policy: ignore

Node Group State

---------------------------------------------------------------- ---------------

lpar1 ONLINE

lpar9 OFFLINE
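
To watch the resource group state change during and after the update without reading the whole cldump report, you can pull out just the node/state table. This is only a sketch: the function name is mine, it assumes the "Node Group State" table layout shown above, and it only handles single-word states such as ONLINE, OFFLINE and UNMANAGED.

```shell
# Hypothetical helper: read cldump output on stdin and print "node state" for
# each row of the "Node / Group State" table.
rg_states() {
    awk '
        # Table header line: "Node   Group State"
        $1 == "Node" && $2 == "Group" && $3 == "State" { in_table = 1; next }
        !in_table                { next }
        NF == 0 || /^[- ]+$/     { next }            # blank line or dashed rule
        NF == 2 && $0 !~ /:/     { print $1, $2; next }
                                 { in_table = 0 }    # anything else ends the table
    '
}
```

Running cldump | rg_states in a loop while cl_ezupdate is working should show the states move from ONLINE/OFFLINE to UNMANAGED and back again.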

The cl_ezupdate process logs all its actions to log files in /var/hacmp/EZUpdate, on the node where the command was issued.

[root@lpar1]/var/hacmp/EZUpdate# ls -ltr

total 2336

-rw-r--r-- 1 root system 288218 Jan 17 14:35 EZUpdate.log.4

-rw-r--r-- 1 root system 288205 Jan 17 14:38 EZUpdate.log.3

-rw-r--r-- 1 root system 329460 Jan 17 14:52 EZUpdate.log.2

-rw-r--r-- 1 root system 488 Jan 17 15:08 EZUpdate.log.1

-rw-r--r-- 1 root system 278000 Jan 17 15:10 EZUpdate.log

You’ll notice, during the update, that the cl_ezupdate command is running the nimclient command to pull down and install the updates on all nodes.

root 15990974 16187598 0 14:39:22 - 0:00 /usr/sbin/nimclient -o cust -a installp_flags=agXw -a accept_licenses=yes -a lpp_source=pha721sp2 -a filesets=-f/tmp/pha721s

Rejecting Updates

You can also reject updates. I tested this by simply rejecting all the updates that were just applied in the previous step. By default, the cl_ezupdate -R command rejects the updates on both nodes; you can choose a subset of nodes if you wish.

[root@lpar1]/# cl_ezupdate -R -s pha721sp2

Checking for root authority...

Running as root.

Checking for AIX level...

The installed AIX version is supported.

Checking for PowerHA SystemMirror version.

The installed PowerHA SystemMirror version is supported.

Checking for clcomd communication on all nodes...

clcomd on each node can both send and receive messages.

INFO: The cluster: STGMELB is in state: STABLE

INFO: The node: lpar1 is in state: NORMAL

INFO: The node: lpar9 is in state: NORMAL

Checking for NIM servers...

Checking for lpps and Ifixes from source: pha721sp2...

Build lists of filesets that can be apply reject or commit on node lpar9

There is nothing to install on node: lpar9 from source: pha721sp2

Fileset list to reject or commit on node lpar9 = cluster.es.assist.db2 cluster.es.assist.sap cluster.es.assist.websphere cluster.es.client.lib cluster.es.cspoc.cmds cluster.es.cspoc.rte cluster.es.server.diag cluster.es.server.events cluster.es.server.rte cluster.es.server.utils cluster.es.smui.agent cluster.es.smui.common

Build lists of filesets that can be apply reject or commit on node lpar1

There is nothing to install on node: lpar1 from source: pha721sp2

Fileset list to reject or commit on node lpar1 = cluster.es.assist.db2 cluster.es.assist.sap cluster.es.assist.websphere cluster.es.client.lib cluster.es.cspoc.cmds cluster.es.cspoc.rte cluster.es.server.diag cluster.es.server.events cluster.es.server.rte cluster.es.server.utils cluster.es.smui.agent cluster.es.smui.common

Stopping the node lpar9...

Stopping PowerHA cluster services on node: lpar9 in offline mode...

lpar9: 0513-044 The clinfoES Subsystem was requested to stop.

lpar9: 0513-044 The clevmgrdES Subsystem was requested to stop.

"lpar9" is now offline.

lpar9: Jan 17 2018 15:10:08/usr/es/sbin/cluster/utilities/clstop: called with flags -N -g

Rejecting applied updates on node: lpar9...

Succeeded to reject updates on node: lpar9.

Starting the node: lpar9...

...

"lpar9" is now online.

Stopping the node lpar1...

Stopping PowerHA cluster services on node: lpar1 in unmanage mode...

"lpar1" is now unmanaged.

lpar1: Jan 17 2018 15:13:22/usr/es/sbin/cluster/utilities/clstop: called with flags -N -f

Rejecting applied updates on node: lpar1...

Succeeded to reject updates on node: lpar1.

......

"lpar1" is now online.

The updates have been rejected successfully and both nodes have returned to PowerHA 7.2.1 SP1.

[root@lpar1]/# clcmd halevel -s

-------------------------------

NODE lpar9.meldemo.au.ibm.com

-------------------------------

7.2.1 SP1

-------------------------------

NODE lpar1.meldemo.au.ibm.com

-------------------------------

7.2.1 SP1

I did notice that in both cases, applying and rejecting, the clinfoES daemon was not restarted when the cluster returned from unmanaged to online mode. I simply restarted the daemon manually on both nodes. This could be a bug. I’ll need to dig deeper.

[root@lpar1]/# clcmd startsrc -s clinfoES

-------------------------------

NODE lpar9.meldemo.au.ibm.com

-------------------------------

0513-059 The clinfoES Subsystem has been started. Subsystem PID is 16252936.

-------------------------------

NODE lpar1.meldemo.au.ibm.com

-------------------------------

0513-059 The clinfoES Subsystem has been started. Subsystem PID is 13762662.

[root@lpar1]/# clstat -o

clstat - PowerHA SystemMirror Cluster Status Monitor

-------------------------------------

Cluster: STGMELB (1529383297)

Wed Jan 17 15:20:13 2018

State: UP Nodes: 2

SubState: STABLE

Node: lpar1 State: UP

Interface: lpar1 (0) Address: 10.1.50.31

State: UP

Interface: lpar1svc (0) Address: 10.1.50.199

State: UP

Resource Group: RG1 State: On line

Node: lpar9 State: UP

Interface: lpar9 (0) Address: 10.1.50.39

State: UP
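
Rather than starting clinfoES blindly, you could check its status first. A minimal sketch follows, assuming the usual lssrc -s output with the status ("active" or "inoperative") in the last column. Both function names are hypothetical, and ensure_clinfo acts on the local node only; wrap it with clcmd, as above, to cover the whole cluster.

```shell
# Hypothetical helper: read `lssrc -s <subsystem>` output on stdin and
# succeed only if the named subsystem is reported as active.
subsystem_active() {
    awk -v name="$1" '
        $1 == name && $NF == "active" { found = 1 }
        END { exit !found }
    '
}

# Start clinfoES on the local node only when it is not already active.
# Requires the AIX SRC commands lssrc and startsrc.
ensure_clinfo() {
    if lssrc -s clinfoES | subsystem_active clinfoES; then
        echo "clinfoES already active"
    else
        startsrc -s clinfoES
    fi
}
```

Calling ensure_clinfo after an apply or reject avoids a second start request when the daemon did come back on its own.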

If you run into any issues with cl_ezupdate, check the log files for obvious problems.

[root@lpar1]/var/hacmp/EZUpdate# tail -100 EZUpdate.log.4

<< End of Success Section >>

+-----------------------------------------------------------------------------+

BUILDDATE Verification ...

+-----------------------------------------------------------------------------+

Verifying build dates...done

FILESET STATISTICS

------------------

12 Selected to be installed, of which:

12 Passed pre-installation verification

----

12 Total to be installed

RESOURCES

---------

Estimated system resource requirements for filesets being installed:

(All sizes are in 512-byte blocks)

Filesystem Needed Space Free Space

/ 16 1356624

/usr 357688 1127816

----- -------- ------

TOTAL: 357704 2484440

NOTE: "Needed Space" values are calculated from data available prior

to installation. These are the estimated resources required for the

entire operation. Further resource checks will be made during

installation to verify that these initial estimates are sufficient.

******************************************************************************

End of installp PREVIEW. No apply operation has actually occurred.

******************************************************************************'

INF: Succeeded to install preview updates on node: lpar1.

INF: _.clrsh_cmd()[1126](46.804): execute cmd: rm /tmp/pha721sp2_750lpar4.bnd 2>&1

INF: _.clrsh_cmd()[1151](46.811): command returns code:0

INF: _.clrsh_cmd()[1152](46.811): command output=''

INF: cleanup()[171](46.812): Entered cleanup

### 2018_01_17 14:35 - Leaving script: cl_ezupdate -P -V -S pha721sp2 rc=0
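
The last line of each log records the command and its return code, which gives a quick way to summarise recent runs. The sketch below assumes every run ends with a "Leaving script: ... rc=N" line in the same format as the one above; I have only confirmed that format from this one excerpt, and the function name is my own.

```shell
# Hypothetical helper: for each log file given, print the return code and
# command recorded on its "Leaving script" line(s).
# Usage: ezupdate_exit_codes /var/hacmp/EZUpdate/EZUpdate.log*
ezupdate_exit_codes() {
    for f in "$@"; do
        awk -v file="$f" '
            /Leaving script:/ {
                rc = $NF                                  # e.g. "rc=0"
                sub(/^.*Leaving script: */, "")           # keep the command onwards
                sub(/ *rc=[0-9]+$/, "")                   # drop the trailing rc=N
                print file ": " rc " (" $0 ")"
            }
        ' "$f"
    done
}
```

Any line that does not show rc=0 points to the log file worth reading in full.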

Comments (1)

Mike Coffey commented Feb 14 2018

Release 7.2.2 added the ability to have an alt_disk_copy of rootvg and restore rootvg if errors are encountered.
