NIMSH, SSL and LPM

Do you use SSL with nimsh on AIX? No? Well, you might want to consider it. If you regularly use LPM to migrate AIX partitions from one server to another, you may have found that, on occasion, your NIM master has trouble communicating with its NIM clients afterwards. This is by design, as nimsh uses the NIM client's cpuid to authenticate with the NIM master. During an LPM operation, the cpuid of the NIM client changes, and it's possible the NIM master may reject the client as a result. This problem can occur even when CPU validation is disabled on the NIM master.
In the example below, I’ve LPM’ed a NIM client (750lpar1) to another server. Immediately afterwards, I’m able to execute a NIM command against the NIM client, from the NIM master (750lpar4). At this point the NIM client is configured with standard nimsh authentication i.e. no SSL.
NIM CLIENT (AFTER LPM):
[root@750lpar1]/ # uname -a
AIX 750lpar1 1 7 00F603CD4C00
[root@750lpar1]/ # uname -a
AIX 750lpar1 1 7 00F627664C00
[roo
NIM MASTER:
[root@750lpar4]/ # nim -o change -a validate_cpuid=no master
[root@750lpar4]/ # lsnim -l master | grep -i cpu
   validate_cpuid = no
[root@750lpar4]/ # lsnim -l 750lpar1
750lpar1:
   class = machines
   type = standalone
   connect = nimsh
   platform = chrp
   netboot_kernel = 64
   if1 = 10_1_50 750lpar1 0
   cable_type1 = N/A
   Cstate = ready for a NIM operation
   prev_state = not running
   Mstate = currently running
   cpuid = 00F6
[root@750lpar4]/ # nim -o showlog 750lpar1
HELLO
The cpuid is cached by the nimsh daemon, so the previous system ID is retained in memory and passed to the NIM master, which allows the operation to complete successfully. But if I restart the nimsh daemon on the NIM client, I find that the NIM master is no longer able to communicate with the client.
[root@750lpar1]/ # stopsrc -s nimsh
0513-044 The nimsh Subsystem was requested to stop.
[root@750lpar1]/ # startsrc -s nimsh
0513-059 The nimsh Subsystem has been started. Subsystem PID is 6160610.
[root@750lpar4]/ # nim -o define -t mksysb -a source=750lpar1 -a mksysb_flags=-e -a mk_image=yes -a server=master -a loca
0042-001 nim: processing error encountered on "master":
   0042-006 m_mkbosi: (From_Master) connect Error 0
0042-008 nimsh: Request denied - 750lpar4
[root@750lpar4]/ # nim -o showlog 750lpar1
0042-001 nim: processing error encountered on "master":
   0042-006 m_showlog: (From_Master) connect Error 0
0042-008 nimsh: Request denied - 750lpar4
[roo
Wed Apr 27 20:51:12 2016 error: local value passed, '00F603CD4C00', does not match environment value '00F627664C00'
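If you want to spot this condition quickly, the mismatch message itself carries both IDs. Here is a minimal sketch that takes one log line and prints the two values. The helper name parse_cpuid_mismatch is something I made up for illustration (it is not an AIX command), and the sketch assumes the message format matches the line shown above exactly.

```shell
#!/bin/sh
# parse_cpuid_mismatch: hypothetical helper, not part of AIX or NIM.
# $1 is a single log line. If it is a cpuid mismatch message in the format
# shown above, print "passed=<id> environment=<id>" and return 0.
# Otherwise print nothing and return non-zero.
parse_cpuid_mismatch() {
    printf '%s\n' "$1" | sed -n \
        "s/.*local value passed, '\([0-9A-Fa-f]*\)', does not match environment value '\([0-9A-Fa-f]*\)'.*/passed=\1 environment=\2/p" \
        | grep .
}
```

The labels passed and environment simply mirror the wording of the log message. Feeding it the line above should report both IDs; any other log line produces no output and a non-zero return code.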
One way to work around this problem is to use the procedure outlined at the following link:
http
“Migrating a NIM client by using LPM
When Live Partition Mobility (LPM) is used to move a machine from one physical server to another and the machine is defined as a Network Installation Management (NIM) client, the NIM administrator must update the cpuid attribute for the NIM client to reflect the new hardware value after the LPM migration completes. To update the cpuid attribute, complete the following steps:
On the NIM client, acquire the new cpuid ID by running the following command:
uname -a
On the NIM master, run the following command:
nim -o change -a cpuid=cpuid client”
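The two documented steps can be sketched in shell as follows. This is only an illustration: cpuid_from_uname and cpuid_update_cmd are hypothetical helper names, and the sketch assumes the machine ID is the last field of the uname -a output, as it is in the sample output earlier in this post. The second helper only prints the nim command, so you can review it before running it on the master.

```shell
#!/bin/sh
# cpuid_from_uname: hypothetical helper.
# $1 is one line of `uname -a` output; prints its last field.
# Assumption: the machine ID is the last field, as in the samples above.
cpuid_from_uname() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# cpuid_update_cmd: hypothetical helper.
# $1 = NIM client name, $2 = new cpuid.
# Prints (does not run) the nim command from the documented procedure.
cpuid_update_cmd() {
    printf 'nim -o change -a cpuid=%s %s\n' "$2" "$1"
}
```

On the client you would call cpuid_from_uname "$(uname -a)", then pass the result and the client name to cpuid_update_cmd on the master and run the command it prints.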
However, there is a better way. Using nimsh with SSL-enabled authentication prevents the cpuid check during nimsh service handling. This is the recommended mode of operation, since the client and server can agree on identity using the certificate information passed during the SSL handshake. Once the certificate is in place, the NIM master skips cpuid validation and relies on the success of the SSL handshake instead. This configuration works well with LPM. With standard nimsh, the cpuid update limitation still applies, because NIM has no way of automatically updating a client definition once the value has changed.
Both the NIM master and client must be configured to support SSL-enabled authentication. To configure SSL-enabled authentication, we can use ‘nimconfig -c’ on the NIM master.
[root@750lpar4]/ # oslevel -s
7100-04-01-1543
[root@750lpar4]/ # nimconfig -c
0513-029 The tftpd Subsystem is already active.
Multiple instances are not supported.
NIM_
x - /usr
x - /usr
Target "all" is up to date.
Generating a 1024 bit RSA private key
....
....
writing new private key to '/ss
-----
Signature ok
subj
Getting Private key
Generating a 1024 bit RSA private key
...++++++
....++++++
writing new private key to '/ss
-----
Signature ok
subj
Getting CA Private Key
Generating a 1024 bit RSA private key
....
..........++++++
writing new private key to '/ss
-----
Signature ok
subj
Getting CA Private Key
[root@750lpar4]/ # lsnim -l | grep -i ssl
   ssl_support = yes
[root@750lpar4]/ # /usr
certname= /ssl
subject= /C=U
issuer= /C=U
notAfter=Apr 28 10:25:17 2017 GMT
[root@750lpar4]/ # /usr
certname= /tft
subject= /C=U
issuer= /C=U
notAfter=Apr 28 10:25:17 2017 GMT
To enable SSL nimsh on the NIM client, we can use the ‘nimclient -c’ command. You can fall back to non-SSL nimsh with ‘nimclient -C’.
[root@750lpar1]/ # oslevel -s
7100-04-01-1543
[roo
x - /usr
x - /usr
Received 2784 Bytes in 0.0 Seconds
0513-044 The nimsh Subsystem was requested to stop.
0513-077 Subsystem has been changed.
0513-059 The nimsh Subsystem has been started. Subsystem PID is 14942422.
[roo
Fri Apr 29 23:10:45 2016 /usr/sbin/nimsh: NIM Service Handler started from SRC
Fri Apr 29 23:10:45 2016 no environment value for NIM_SECONDARY_PORT
Fri Apr 29 23:10:45 2016 value for hostname is 750lpar1
Fri Apr 29 23:10:45 2016 value for netaddr is: 10.1.50.31
Fri Apr 29 23:10:45 2016 value for route is net,-hopcount,0,,0 and gateway is 10.1.50.1
Fri Apr 29 23:10:45 2016 value for netif is en0
Fri Apr 29 23:10:45 2016 value for netmask is 255.255.255.0
Fri Apr 29 23:10:45 2016 obtained master's hostname: NIM_
Fri Apr 29 23:10:45 2016 obtained master's id: NIM_
Fri Apr 29 23:10:45 2016 value for machine id is 00F603CD4C00
Fri Apr 29 23:10:45 2016 Refreshing archive member /usr
Fri Apr 29 23:10:45 2016 Refreshing archive member /usr
Now I can LPM the NIM client to another server and even if I restart nimsh, the master can still communicate with the client. All thanks to SSL.
NIM MASTER:
[root@750lpar4]/ # lsnim -l 750lpar1
750lpar1:
   class = machines
   type = standalone
   connect = nimsh (secure)
   platform = chrp
   netboot_kernel = 64
   if1 = 10_1_50 750lpar1 0
   cable_type1 = N/A
   Cstate = ready for a NIM operation
   prev_state = not running
   Mstate = currently running
   cpuid = 00F603CD4C00
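A quick way to confirm that a client is using SSL is to look at the connect attribute, which reads nimsh (secure) in the output above. The sketch below pulls a single attribute out of lsnim -l style text. Both nim_attr and client_uses_ssl are hypothetical helper names of my own, and the sketch assumes the "name = value" layout shown in this post.

```shell
#!/bin/sh
# nim_attr: hypothetical helper, not part of NIM.
# Reads `lsnim -l <object>` style text on stdin and prints the value of the
# attribute named in $1. Returns non-zero if the attribute is not present.
nim_attr() {
    awk -v key="$1" '
        {
            pos = index($0, "=")            # split on the first "=" only
            if (pos == 0) next              # skip lines such as "750lpar1:"
            name  = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; found = 1; exit }
        }
        END { exit(found ? 0 : 1) }
    '
}

# client_uses_ssl: hypothetical helper.
# Succeeds when the connect attribute on stdin is "nimsh (secure)".
client_uses_ssl() {
    [ "$(nim_attr connect)" = "nimsh (secure)" ]
}
```

On the master, something like lsnim -l 750lpar1 | client_uses_ssl would then succeed only for SSL-enabled clients, and lsnim -l 750lpar1 | nim_attr cpuid would print the cpuid the master has on record.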
NIM CLIENT (AFTER LPM):
[root@750lpar1]/ # uname -a
AIX 750lpar1 1 7 00F603CD4C00
[root@750lpar1]/ # uname -a
AIX 750lpar1 1 7 00F627664C00
[root@lpar1]/ # stopsrc -s nimsh
0513-044 The nimsh Subsystem was requested to stop.
[root@lpar1]/ # startsrc -s nimsh
0513-059 The nimsh Subsystem has been started. Subsystem PID is 6160616.
[roo
Fri Apr 29 23:13:33 2016 passing OpenSSL setting of 1
Fri Apr 29 23:13:33 2016 set symbol table
Fri Apr 29 23:13:33 2016 cert filename discovered: /ssl
Fri Apr 29 23:13:33 2016 ** OpenSSL FIPS mode enabled successfully
Fri Apr 29 23:13:33 2016 seed_prng
Fri Apr 29 23:13:33 2016 Loading certificates..
Fri Apr 29 23:13:33 2016 Loading private key file..
Fri Apr 29 23:13:33 2016 create BIO
Fri Apr 29 23:13:33 2016 - SSL Connection verified successfully -
Fri Apr 29 23:13:33 2016 sending ack to client
Fri Apr 29 23:13:33 2016 setting descriptors to include 2nd port
Fri Apr 29 23:13:33 2016 command to exec __ /usr
[root@750lpar4]/ # nim -o define -t mksysb -a source=750lpar1 -a mksysb_flags=-e -a mk_image=yes -a server=master -a loca
+--- System Backup Image Space Information (Sizes are displayed in 1024-byte blocks.) +---
Required = 9457741 (9237 MB) Available = 51653640 (50444 MB)
Creating information file (/image.data) for rootvg.
..etc..
Using NIM to install clients configured with SSL authentication http
Using the certificate viewing file
This process has saved our collective backsides in a data center migration, small env of 48 AIX VM BUT STILL................
This whole thing was a foobar until I came across this method, MANY THX!!!!!!!!
LT
I'm glad this post helped! Thanks for letting me know.
I really like the idea of not having my mksysb operations fail after an LPM. My concern is we've seen so many bugs with openssl that, from a security standpoint, are we opening ourselves to hacking by enabling this SSL authentication? Also, how does this affect NIM installs? Can we do NIM installs with SSL authentication?
Hi Chris/Matt, I thought I had solved this issue as well by turning off CPU validation, and I did some testing (successfully). With CPU validation turned off, you may think that NIM still works when you LPM an LPAR to another frame, because NIM operations still work. But when I LPM'ed the LPAR back, I started getting the error about cpuid, even though it was back on the original frame. Of course, you can resolve this by simply resetting nimsh etc. But what if you don't want to fix this manually and would rather not worry about it every time you LPM an LPAR? Great article again Chris!
Hi Chris, great article, but why not just turn off CPU validation entirely? - "nim -o change -a validate_cpuid=no master"