Starting with AIX 6.1 TL08 and AIX 7.1 TL02, there's a new AIX CPU tuning feature called Scaled Throughput mode. It is supported on POWER7 and POWER7+ processors only (do not try this on POWER6!). The new mode can dispatch workload to more SMT threads per VP, avoiding the need to unfold additional VPs. I've heard it described as being more "POWER6-like". I'm not suggesting that you use this feature; this post simply discusses what the new mode can do.

By default, AIX (on POWER7) operates in Raw Throughput mode. This mode provides the best performance per thread per core and the best response times, at the cost of utilising more cores (VPs) to process a system's workload. By comparison, Scaled Throughput mode provides a greater level of per-core throughput by dispatching more SMT threads on each core, which has the effect of utilising fewer VPs/cores. In this mode, more (or all) of the SMT threads on a core will be utilised before workload is dispatched to other VPs/cores in the system.
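Before changing anything, it's worth confirming which mode a partition is currently running in and how many VPs it has to play with. A minimal check using standard AIX commands (run as root; output will vary from system to system):

# schedo -o vpm_throughput_mode         # current mode (0 = Raw Throughput, the default)
# lparstat -i | grep -i "virtual cpus"  # minimum/desired/online/maximum VP counts
# smtctl                                # SMT mode and thread status for each virtual processor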

The schedo tuning command can be used to enable the new mode via a new tunable called vpm_throughput_mode, e.g.

# schedo -p -o vpm_throughput_mode=X

This tunable can be set to one of the following values:

0 = Legacy Raw mode (default).

1 = Scaled or Enhanced Raw mode with a higher threshold than legacy.

2 = Scaled mode, use primary and secondary SMT threads.

4 = Scaled mode, use all four SMT threads.

At this stage, this tunable is not restricted, but if you plan on experimenting with it, please be careful: make sure you understand how this tuning may impact your system, and always test new tuning in a non-production environment first!
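If you do decide to experiment, schedo can also describe the tunable itself; a short sketch (the exact help text and value ranges reported will vary by AIX level):

# schedo -h vpm_throughput_mode   # display the tunable's description
# schedo -L vpm_throughput_mode   # list its current, default and permitted values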

I performed a couple of quick tests today, just to see what impact tuning the parameter would have on an AIX 7.1 TL2 system.

aixlpar1 : / # oslevel -s

7100-02-01-1245

aixlpar1 : / # lsconf | grep Mode

System Model: IBM,9119-FHB

Processor Implementation Mode: POWER 7

I started some CPU-intensive workload: four ncpu processes, shown in the process listing below. (If you want to generate a similar load yourself, a simple alternative is sketched after the listing.)

Name            PID  CPU%   PgSp  Owner
ncpu        4718826  15.7   108K  db
ncpu        7143668  15.7   108K  db
ncpu        8126488  15.7   108K  db
ncpu        5832920  15.6   108K  db
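The ncpu processes above are simply CPU burners. If you don't have ncpu handy and want to drive a similar load for your own testing, a rough ksh equivalent might look like this (the count of 4 is an assumption, chosen to match the 4 VPs in this partition):

# Start four CPU-bound busy loops in the background.
i=0
while [ $i -lt 4 ]
do
    ( while : ; do : ; done ) &   # each subshell spins, consuming one logical CPU's worth of work
    i=$((i+1))
done

# When finished, clean up the background loops:
# kill $(jobs -p)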

The vpm_throughput_mode parameter was left at the default value (0).

# schedo -a | grep vpm_throughput_mode

vpm_throughput_mode = 0

As expected, the workload was evenly dispatched across the primary SMT threads of the 4 VPs assigned to the partition, i.e. logical CPUs 0, 4, 8 and 12. None of the secondary or tertiary SMT threads were active. This is the default mode and will provide the greatest raw throughput (performance) per VP, as there's no overhead associated with enabling the secondary or tertiary SMT threads.

Topas Monitor for host:aixlpar1          EVENTS/QUEUES    FILE/TTY
Fri Dec 21 11:06:34 2012   Interval:2    Cswitch     184  Readch      864
                                         Syscall     299  Writech     651
CPU  User%  Kern%  Wait%  Idle%  Physc   Reads       111  Rawin         0
0     99.7    0.3    0.0    0.0   0.63   Writes        1  Ttyout      233
1      0.2    0.5    0.0   99.3   0.12   Forks         0  Igets         0
2      0.0    0.0    0.0  100.0   0.12   Execs         0  Namei         9
3      0.0    0.0    0.0  100.0   0.12   Runqueue   4.00  Dirblk        0
4    100.0    0.0    0.0    0.0   0.63   Waitqueue   0.0
5      0.0    0.0    0.0  100.0   0.12                    MEMORY
6      0.0    0.0    0.0  100.0   0.12   PAGING           Real,MB    4096
7      0.0    0.0    0.0  100.0   0.12   Faults        0  % Comp       22
8    100.0    0.0    0.0    0.0   0.63   Steals        0  % Noncomp     2
9      0.0    0.0    0.0  100.0   0.12   PgspIn        0  % Client      2
10     0.0    0.0    0.0  100.0   0.12   PgspOut       0
11     0.0    0.0    0.0  100.0   0.12   PageIn        0  PAGING SPACE
12   100.0    0.0    0.0    0.0   0.63   PageOut       0  Size,MB    2048
13     0.0    0.0    0.0  100.0   0.12   Sios          0  % Used        0
14     0.0    0.0    0.0  100.0   0.12                    % Free      100
15     0.0    0.0    0.0  100.0   0.12   NFS (calls/sec)
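topas isn't the only way to see this; mpstat's SMT report groups the logical CPUs under their virtual processors, and lparstat shows the partition-level physical consumption. A quick sketch (the interval and count values are arbitrary):

# mpstat -s 5 2   # per-VP view of how busy each SMT thread is
# lparstat 5 2    # partition-wide physical processor consumption (physc) over the same period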

Next, I enabled Scaled Throughput mode with primary and secondary SMT threads (2). The workload slowly migrated to logical CPUs 4, 5, 8 and 9. So now only two primary SMT threads (lcpus 4 and 8) and two secondary threads (lcpus 5 and 9) were active, and all the processing was being performed by fewer VPs (almost like POWER6).

# schedo -p -o vpm_throughput_mode=2

Topas Monitor for host:aixlpar1          EVENTS/QUEUES    FILE/TTY
Fri Dec 21 11:07:36 2012   Interval:2    Cswitch     179  Readch      935
                                         Syscall     301  Writech     794
CPU  User%  Kern%  Wait%  Idle%  Physc   Reads       112  Rawin         0
0     14.1   60.3    0.0   25.6   0.00   Writes        2  Ttyout      304
1      5.7   32.8    0.0   61.5   0.00   Forks         0  Igets         0
2      0.0    1.6    0.0   98.4   0.00   Execs         0  Namei        10
3      0.0    3.0    0.0   97.0   0.00   Runqueue   4.00  Dirblk        0
4    100.0    0.0    0.0    0.0   0.47   Waitqueue   0.0
5    100.0    0.0    0.0    0.0   0.47                    MEMORY
6      0.0    0.0    0.0  100.0   0.03   PAGING           Real,MB    4096
7      0.0    0.0    0.0  100.0   0.03   Faults        0  % Comp       22
8    100.0    0.0    0.0    0.0   0.47   Steals        0  % Noncomp     2
9    100.0    0.0    0.0    0.0   0.47   PgspIn        0  % Client      2
10     0.0    0.0    0.0  100.0   0.03   PgspOut       0
11     0.0    0.0    0.0  100.0   0.03   PageIn        0  PAGING SPACE
12     0.0   50.6    0.0   49.4   0.00   PageOut       0  Size,MB    2048
13     0.0   13.3    0.0   86.7   0.00   Sios          0  % Used        0
14     0.0    0.8    0.0   99.2   0.00                    % Free      100
15     0.0    0.8    0.0   99.2   0.00   NFS (calls/sec)
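One thing to note: the -p flag makes the change persistent across reboots. While you're still evaluating the behaviour, it may be safer to change only the running value, so a reboot will put things back the way they were:

# schedo -o vpm_throughput_mode=2   # dynamic change only; not preserved over a reboot
# schedo -o vpm_throughput_mode     # confirm the current running value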

And finally, I tried Scaled mode with all four SMT threads (4). All of the workload migrated to a single VP, but all 4 of its SMT threads were utilised (primary SMT thread lcpu 0, and secondary/tertiary SMT threads lcpus 1, 2 and 3). This mode offers lower overall core consumption, but it has the (possibly negative) side effect of loading more SMT threads on a single VP/core, which may not perform as well as the same workload evenly dispatched to 4 individual VPs/cores (on their primary SMT threads).

# schedo -p -o vpm_throughput_mode=4

Topas Monitor for host:aixlpar1          EVENTS/QUEUES    FILE/TTY
Fri Dec 21 11:08:30 2012   Interval:2    Cswitch     199  Readch      385
                                         Syscall     149  Writech     769
CPU  User%  Kern%  Wait%  Idle%  Physc   Reads         2  Rawin         0
0     99.7    0.3    0.0    0.0   0.25   Writes        1  Ttyout      375
1     99.9    0.1    0.0    0.0   0.25   Forks         0  Igets         0
2    100.0    0.0    0.0    0.0   0.25   Execs         0  Namei         6
3    100.0    0.0    0.0    0.0   0.25   Runqueue   4.00  Dirblk        0
4      0.0   48.8    0.0   51.2   0.00   Waitqueue   0.0
5      0.0    5.5    0.0   94.5   0.00                    MEMORY
6      0.0    3.6    0.0   96.4   0.00   PAGING           Real,MB    4096
7      0.0    3.3    0.0   96.7   0.00   Faults        0  % Comp       22
8      0.0   77.1    0.0   22.9   0.00   Steals        0  % Noncomp     2
9      0.0   52.9    0.0   47.1   0.00   PgspIn        0  % Client      2
10     0.0   44.8    0.0   55.2   0.00   PgspOut       0
11     0.0   38.5    0.0   61.5   0.00   PageIn        0  PAGING SPACE
12     0.0   51.7    0.0   48.3   0.00   PageOut       0  Size,MB    2048
13     0.0    8.9    0.0   91.1   0.00   Sios          0  % Used        0
14     0.0    7.0    0.0   93.0   0.00                    % Free      100
15     0.0    7.4    0.0   92.6   0.00   NFS (calls/sec)
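When the testing is finished, the tunable can be returned to Raw Throughput mode with schedo's -d (reset to default) option, e.g.:

# schedo -p -d vpm_throughput_mode   # reset to the default value (0)
# schedo -o vpm_throughput_mode      # verify the running value is back to 0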

For more information on Scaled Throughput mode, take a look at the following presentation:

http://t.co/jVtJ40Ty