High-end POWER7 systems, such as the POWER7 795, support configurations of up to 256 physical processors. Most of us can only dream of working on a system of this size. Those of us fortunate enough to play with these big boxes may find collecting and reviewing CPU-related performance data a challenging task.
With 256 physical POWER7 processors, each with 4 threads enabled (SMT4), there can be up to 1024 logical CPUs (256 x 4) active on the system. As a consequence, a lot of performance monitoring data is generated, and the OS has to filter it and present the relevant data effectively to the administrator.
Many AIX performance tools provide per-CPU statistics for the administrator to analyse application and system performance. With as many as 1024 logical CPUs on a system, these tools are affected. All standard AIX performance tools have been modified in AIX 7.1 (and AIX 6.1 TL6) to support AIX partitions running as many as 1024 logical CPUs on a POWER7 system.
The following tools have been modified to support the increase in logical CPUs and now provide options for filtering and sorting data: mpstat, sar and topas. Both the mpstat and sar commands have been updated to allow sorting and filtering of their output through the new -O flag. The following options can be supplied to the -O flag (from the mpstat man page):
-O Options
    Specifies the command option.
    -O options=value...
    Following are the supported options:
    * sortcolumn = Name of the metrics in the mpstat command output
    * sortorder = [asc|desc]
    * topcount = Number of CPUs to be displayed in the mpstat command sorted output
For example, to see the mpstat output sorted on the cs column, you would enter the following command:
# mpstat -d -O sortcolumn=cs 1 3
As another example, to see the list of the top 10 CPUs, you would enter the following command:
# mpstat -a -O sortcolumn=min,sortorder=desc,topcount=10 1 3
The sar command accepts the same options. Its man page entry is shown below, followed by some examples.
-O Options
    Allows users to specify the command option.
    -O options=value...
    Following are the supported options:
    * sortcolumn = Name of the metrics in the sar command output
    * sortorder = [asc|desc]
    * topcount = Number of CPUs to be displayed in the sar command sorted output
To display the sar output sorted on the cswch/s column with the -w flag, you would enter the following command:
# sar -w -P ALL -O sortcolumn=cswch/s 1 3
To list the top ten CPUs, sorted on the scall/s column, you would enter the following command:
# sar -c -O sortcolumn=scall/s,sortorder=desc,topcount=10 -P ALL 1 3
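To make the meaning of the three -O options concrete, here is a minimal Python sketch of what sorting on a column and keeping the top N CPUs amounts to. It is only an illustration of the semantics, not how mpstat or sar are implemented. The function name sort_and_filter and the sample numbers are made up for the example.

```python
from operator import itemgetter


def sort_and_filter(rows, sortcolumn, sortorder, topcount=None):
    """Mimic -O sortcolumn=...,sortorder=...,topcount=... on per-CPU rows.

    rows is a list of dicts, one per logical CPU (hypothetical layout).
    """
    if sortorder not in ("asc", "desc"):
        raise ValueError("sortorder must be 'asc' or 'desc'")
    ordered = sorted(rows, key=itemgetter(sortcolumn),
                     reverse=(sortorder == "desc"))
    # topcount limits how many CPUs are shown after sorting.
    return ordered if topcount is None else ordered[:topcount]


# Invented per-CPU context-switch counts, for illustration only.
sample = [
    {"cpu": 0, "cs": 120},
    {"cpu": 1, "cs": 980},
    {"cpu": 2, "cs": 45},
    {"cpu": 3, "cs": 410},
]

top2 = sort_and_filter(sample, sortcolumn="cs", sortorder="desc", topcount=2)
for row in top2:
    print(row["cpu"], row["cs"])
```

On a 1024-way partition the point is the same: sort first, then trim, so only the busiest (or idlest) CPUs reach the screen.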
Some performance tools have also been enhanced to generate XML reports. The following tools now have this feature: sar, mpstat, vmstat, iostat and lparstat. Specifying the -X flag with these commands will produce XML output. You can specify an output file with the -o flag. If you don't specify an output file, a default file is generated with the following naming convention: command_DDMMYYHHMM.xml.
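If you want to predict or match the default file name in a script, the convention can be sketched in a few lines of Python. The helper name default_xml_name is hypothetical, and the sketch assumes the timestamp is the local time at which the command was started.

```python
from datetime import datetime


def default_xml_name(command, when):
    """Build a name following the command_DDMMYYHHMM.xml convention."""
    # %d%m%y%H%M -> day, month, two-digit year, hour, minute
    return f"{command}_{when.strftime('%d%m%y%H%M')}.xml"


# Example: sar started on 25 Nov 2010 at 21:59.
name = default_xml_name("sar", datetime(2010, 11, 25, 21, 59))
print(name)
```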
Here are some examples of using the -X flag with the updated commands:
# lparstat -X
# sar -X 1 5
# mpstat -X 1 5
# vmstat -X 1 3
# iostat -X 1 3
*** stack smashing detected ***: program terminated
IOT/Abort trap(coredump)
#
# ls -lr *.xml
-rw-r--r--    1 root     system        50027 Nov 25 22:00 vmstat.xml
-rw-r--r--    1 root     system        50852 Nov 25 21:59 sar_2511102159.xml
-rw-r--r--    1 root     system        44570 Nov 25 21:59 mpstat_2511102159.xml
-rw-r--r--    1 root     system        18926 Nov 25 21:59 lparstat_2511102159.xml
In my tests I noticed a couple of things. First, the vmstat output is named vmstat.xml, which doesn't match the expected naming convention. This could be a bug. Second, the iostat command dumped core. The -X flag is meant to be supported by iostat, so this was totally unexpected. And yes, I was running the latest service pack for AIX 7.1 (7100-00-01-1037). This could also be a bug. Anyway, I digress!
Here are some samples of the content of the generated XML files.
# head -20 lparstat_2511102159.xml
<?xml version="1.0" encoding="UTF-8"?>
<PerformanceMeasurement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="lparstat_schema.xsd" >
<ProcessingHeader>
<Command>lparstat</Command>
<XmlFormatVersion>1.0</XmlFormatVersion>
<TimeStamp Name="GeneratedOn" Type="UtcSecs" >
1290743958
</TimeStamp>
<TimeStamp Name="GeneratedOn" Type="PrintableDate" >
Thu Nov 25 21:59:18 2010
</TimeStamp>
</ProcessingHeader>
<NodeDescription NodeClass="POWER_6" NodeId="9" >
<HardwareDescription HardwareClass="POWER_6" >
<CpuClockMhz>3550.000000</CpuClockMhz>
<SystemMemory>6144</SystemMemory>
<TaggedNotes>
# tail -20 sar_2511102159.xml
<CPUUtil CPUID="system" >
<ContextSwitch>3071</ContextSwitch>
</CPUUtil>
</CPUStats>
</SyswideContextSwitch>
</AvgPerProcessorContextSwitch>
<AvgTTYActivity>
<CanChar>0</CanChar>
<ModemIntr>0</ModemIntr>
<OutQueChars>0</OutQueChars>
<InputQueChars>0</InputQueChars>
<ttyRecvIntr>0</ttyRecvIntr>
<TransmitIntr>0</TransmitIntr>
</AvgTTYActivity>
</SarAverageData>
</CollectionDataSet>
</PerformanceMeasurement>
#
# head -40 mpstat_2511102159.xml
<?xml version="1.0" encoding="UTF-8"?>
<PerformanceMeasurement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="mpstat_schema.xsd" >
<ProcessingHeader>
<Command>mpstat</Command>
<XmlFormatVersion>1.0</XmlFormatVersion>
<TimeStamp Name="GeneratedOn" Type="UtcSecs" >
1290743975
</TimeStamp>
<TimeStamp Name="GeneratedOn" Type="PrintableDate" >
Thu Nov 25 21:59:35 2010
</TimeStamp>
</ProcessingHeader>
<NodeDescription NodeClass="POWER_6" NodeId="9" >
<HardwareDescription HardwareClass="POWER_6" >
<CpuClockMhz>3550.000000</CpuClockMhz>
<SystemMemory>6144</SystemMemory>
<TaggedNotes>
<NoteTag Key="ChipType" >N/A</NoteTag>
<NoteTag Key="ChipRevision" >N/A</NoteTag>
<NoteTag Key="SerialNumber" >N/A</NoteTag>
<NoteTag Key="SystemType" >N/A</NoteTag>
<NoteTag Key="SystemModel" >N/A</NoteTag>
</TaggedNotes>
</HardwareDescription>
<LogicalPartitionDescription Hostname="l273pp007_pub" LparId="9" >
<HardwareDescription>
<NumberConfiguredProcessors>32</NumberConfiguredProcessors>
</HardwareDescription>
<LogicalPartitionConfiguration Name="l273pp007" >
<LogicalPartitionType>Shared</LogicalPartitionType>
<CappedAttribute>Uncapped</CappedAttribute>
<EntitledCapacity>5000</EntitledCapacity>
<UncappedWeight>128</UncappedWeight>
</LogicalPartitionConfiguration>
<SoftwareDescription>
#
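Because the reports are plain XML, they are easy to post-process. The following is a minimal Python sketch that pulls a few header fields out of a report with the standard library. The embedded sample is trimmed from the lparstat header shown earlier, with the closing tags added by me so that it parses; a real report is much larger, and the element paths used here are assumed from that excerpt rather than from the schema.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Trimmed, hand-closed sample based on the lparstat header excerpt.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<PerformanceMeasurement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="lparstat_schema.xsd" >
<ProcessingHeader>
<Command>lparstat</Command>
<XmlFormatVersion>1.0</XmlFormatVersion>
<TimeStamp Name="GeneratedOn" Type="UtcSecs" >
1290743958
</TimeStamp>
<TimeStamp Name="GeneratedOn" Type="PrintableDate" >
Thu Nov 25 21:59:18 2010
</TimeStamp>
</ProcessingHeader>
<NodeDescription NodeClass="POWER_6" NodeId="9" >
<HardwareDescription HardwareClass="POWER_6" >
<CpuClockMhz>3550.000000</CpuClockMhz>
<SystemMemory>6144</SystemMemory>
</HardwareDescription>
</NodeDescription>
</PerformanceMeasurement>
"""

root = ET.fromstring(SAMPLE)
header = root.find("ProcessingHeader")

# Which command produced the report.
command = header.findtext("Command")

# The two TimeStamp elements differ only in their Type attribute.
stamps = {ts.get("Type"): ts.text.strip() for ts in header.findall("TimeStamp")}
utc_secs = int(stamps["UtcSecs"])
generated = datetime.fromtimestamp(utc_secs, tz=timezone.utc)

clock_mhz = float(
    root.findtext("NodeDescription/HardwareDescription/CpuClockMhz"))

print(command, generated.isoformat(), clock_mhz)
```

For a real file you would replace ET.fromstring(SAMPLE) with ET.parse(path).getroot().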
And one last thing that I thought I'd mention. The topas command has been enhanced with two useful new features:
· topas panel freezing ['Space Bar' is used as a toggle for freezing]
Topas Monitor for host:l273pp007                EVENTS/QUEUES     FILE/TTY
Thu Nov 25 21:54:56 2010   Interval:FROZEN      Cswitch    15.2G  Readch   578.1G
                                                Syscall    26.2G  Writech  103.4G
CPU     User% Kern% Wait% Idle%   Physc  Entc%  Reads     439.4M  Rawin    170.6K
Total    37.6  62.4   0.0   0.0    3.10 620.80  Writes    533.6M  Ttyout   791.8M
                                                Forks    5996.3K  Igets   2658.0K
Network    BPS  I-Pkts  O-Pkts    B-In   B-Out  Execs    7654.3K  Namei   2048.0M
Total    19.9K       0   178.5       0   19.9K  Runqueue   12.0M  Dirblk    18.1M
                                                Waitqueue 17938.1
Disk    Busy%      BPS     TPS  B-Read  B-Writ                    MEMORY
Total     0.0        0       0       0       0  PAGING            Real,MB    6144
                                                Faults   1753.2M  % Comp       29
FileSystem         BPS     TPS  B-Read  B-Writ  Steals         0  % Noncomp    12
Total            624.7   89.24   624.7       0  PgspIn         0  % Client     11
                                                PgspOut        0
WLM-Class (Active)    CPU%   Mem%  Blk-I/O%     PageIn     18.9M  PAGING SPACE
System                   1     21         0     PageOut    50.4M  Size,MB    1536
Default                  0      2         0     Sios       47.1M  % Used        1
                                                                  % Free       99
Name           PID  CPU%  PgSp Class            NFS (calls/sec)
sendmail   7340250   0.0  924K 52wpar           SerV2          0  WPAR Activ    3
cron       7471202   0.0  324K 52wpar           CliV2          0  WPAR Total    3
IBM.ERrm  10092604   0.0 1.83M 52wpar           SerV3          0  Press: "h"-help
syslogd    7733322   0.0  452K wpar52           CliV3     713.8K         "q"-quit
sshd       7864478   0.0  796K System
srcmstr    7929858   0.0 1.10M wpar1
inetd      8126494   0.0 1016K wpar1
init       8192026   0.0 1.05M wpar1
lockd-1    8257544   0.0 1.19M System
pilegc      786456   0.0  640K System
· topas panel scrolling and sorting [PgUp/PgDn keys are used for scrolling]