saposcol and shared memory on AIX

If you are a system administrator responsible for managing AIX systems that run SAP, then you’ve probably had an experience similar to the following.

One day my SAP Basis administrator contacts me and says, “I can’t start saposcol... can you please reboot the system?” I quickly reply, “Was there an error message when trying to restart saposcol?” He replies, “No”. Again I return very quickly, “OK, have you checked to see if there are any shared memory segments left for saposcol?” Just as quickly, he replies, “How do I do that?”

Together we try starting saposcol, and what we find is that it thinks it’s already running (as shown below, PID 327924). But there is no such process!

    $ saposcol -l
    09:37:24 22.01.2009   LOG: Effective User Id is root
    ****
    * This is Saposcol Version COLL 20.95 700 - AIX v11.15 5L-64 bit 080317
    * Usage:  saposcol -l: Start OS Collector
    *         saposcol -k: Stop  OS Collector
    *         saposcol -d: OS Collector Dialog Mode
    *         saposcol -s: OS Collector Status
    * The OS Collector (PID 327924) is already running .....
    ****
    $ ps -fp 327924
         UID     PID    PPID   C    STIME    TTY  TIME CMD

So we try
to stop saposcol. This of course fails, as there is no PID 327924 running!

    $ saposcol -k
    Setting Stop Flag :
    09:19:29 22.01.2009   LOG: ==== Stop Flag was set by saposcol (kill_collector()).
    09:19:29 22.01.2009   LOG: ====  The collection process will stop as soon as possible
    ****
    can't kill process 327924.
    kill: No such process
    ERROR:No reaction from collecting process 327924.
    Please kill collecting process.

My
conclusion is that there must be a shared memory segment still allocated for saposcol. There were many other SAP processes still running happily, so there were several shared memory segments to sift through. So, which shared memory key does saposcol use?

According to the following website, shared memory key 4dbe is used by saposcol on AIX. http

So I run ipcs to check for the existence of 4dbe, and I find an entry for this key. There are several process IDs ‘attached’ to this segment. However, only one of them actually exists (PID 2293794).

    # ipcs -ma | grep 4dbe
    m   2097156 0x00004dbe --rw-rw-rw-     root   sapsys     root   sapsys      0   1870188  467164 2293794  9:39:43  9:39:43  8:12:43
    # ps -fp 1870188
         UID     PID    PPID   C    STIME    TTY  TIME CMD
    # ps -fp 467164
         UID     PID    PPID   C    STIME    TTY  TIME CMD
    # ps -fp 2293794
         UID     PID    PPID   C    STIME    TTY  TIME CMD
      sapadm 2293794       1   0 19:26:41      -  1:07 sapccm4x -DCCMS pf=/

(Strictly speaking, the fields that follow the owner and group columns in ipcs -ma output are NATTCH, SEGSZ, CPID and LPID. So 0 is the number of attached processes, 1870188 is the segment size in bytes rather than a PID, 467164 is the PID that created the segment, and 2293794 is the last PID to attach to or detach from it.)

I ask the
SAP admin to stop this process, which he does. Now I remove the shared memory segment and there is no evidence of 4dbe in the ipcs output.

    # ipcrm -m 2097156
    # ipcs -ma | grep 4dbe

We were
then able to start saposcol again with success. The process is running and the shared memory segment 4dbe has returned.

    # ipcs -ma | grep 4dbe
    m   3145732 0x00004dbe --rw-rw-rw-     root   sapsys     root   sapsys      1   1870188 2293818 2207892 10:14:25 10:14:25 10:12:47
    # ps -ef | grep oscol
      sapadm 2064616       1   0 10:12:53      -  0:00 saposcol -l
    # /usr
    ****
    Collector Versions :
      running : COLL 20.95 700 - AIX v11.15 5L-64 bit 080317
      dialog  : COLL 20.95 700 - AIX v11.15 5L-64 bit 080317
    Shared Memory       : attached
    Number of records   : 17640
    Active Flag         : active (01)
    Operating System    : AIX aix01 3 5 00C01C704C00
    Collector PID       : 2064616 (001F80E8)
    Collector           : running
    Start time coll.    : Thu Jan 22 10:12:53 2009
    Current Time        : Thu Jan 22 10:17:46 2009
    Last write access   : Thu Jan 22 10:17:38 2009
    Last Read  Access   : Thu Jan 22 10:16:16 2009
    Collection Interval : 10 sec (next delay).
    Collection Interval : 10 sec (last ).
    Status              : free
    Collect Details     : required
    Refresh             : required
    Header Extention Structure
    Number of x-header      Records : 1
    Number of Communication Records : 60
    Number of free Com.     Records : 60
    Resulting offset to 1.data rec. : 61
    Trace level             : 2
    Collector in IDLE - mode ? : NO
      become idle after 300 sec without read access.
      Length of Idle Interval  : 60 sec
      Length of norm.Interval  : 10 sec
    ****

Problem solved and no reboot required... this is AIX after all! :)
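
The checks in this walkthrough could be wrapped in a small shell helper. What follows is only a sketch under a few assumptions: the function names (parse_segment, pid_alive, report_segment) are made up for illustration, and the field positions assume the AIX-style `ipcs -ma` layout (T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ...), which may differ on other platforms. It deliberately prints the ipcrm command rather than running it.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the field numbers assume
# the AIX `ipcs -ma` column order:
#   T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME

# parse_segment KEY
# Reads `ipcs -ma` output on stdin and prints "ID NATTCH CPID LPID"
# for the shared memory segment whose key matches KEY exactly.
parse_segment() {
    awk -v key="$1" '$1 == "m" && $3 == key { print $2, $9, $11, $12 }'
}

# pid_alive PID
# Succeeds if a process with this PID currently exists.
pid_alive() {
    ps -p "$1" >/dev/null 2>&1
}

# report_segment KEY
# Shows the segment for KEY, says whether its creator and last-attach PIDs
# still exist, and prints (does not run) the matching ipcrm command.
report_segment() {
    ipcs -ma | parse_segment "$1" | while read -r id nattch cpid lpid; do
        echo "segment $id (key $1): $nattch process(es) attached"
        for pid in $cpid $lpid; do
            if pid_alive "$pid"; then
                echo "  PID $pid is running; stop it before removing the segment"
            else
                echo "  PID $pid no longer exists"
            fi
        done
        echo "  to remove: ipcrm -m $id"
    done
}
```

After sourcing the file, `report_segment 0x00004dbe` would run the check for the saposcol key used above. Leaving the final ipcrm to the administrator is a deliberate choice here: removing a segment that a live process still uses is something you want a human to confirm.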
    
Worked like a charm on a Prod SAP NetWeaver system I'm helping with. I wonder whether the initial problem was because they killed saposcol with a kill -9, which may not have cleaned up shared memory. Anyway, problem solved. Thanks, Chris.