Re: Intel power tuning - 30% throughput performance increase

Dan van der Ster <dan@xxxxxxxxxxxxxx> · Wed, 3 May 2017 09:43:18 +0200

Hi Blair,

We use cpu_dma_latency=1, because it was in the latency-performance profile.
And indeed by setting cpu_dma_latency=0 on one of our OSD servers,
powertop now shows the package as 100% in turbo mode.

So I suppose we'll pay for this performance boost in energy.
But more importantly, can the CPU survive being in turbo 100% of the time?

-- Dan

On Wed, May 3, 2017 at 9:13 AM, Blair Bethwaite
<blair.bethwaite@xxxxxxxxx> wrote:
> Hi all,
>
> We recently noticed that despite having BIOS power profiles set to
> performance on our RHEL7 Dell R720 Ceph OSD nodes, that CPU frequencies
> never seemed to be getting into the top of the range, and in fact spent a
> lot of time in low C-states despite that BIOS option supposedly disabling
> C-states.
>
> After some investigation this C-state issue seems to be relatively common,
> apparently the BIOS setting is more of a config option that the OS can
> choose to ignore. You can check this by examining
> /sys/module/intel_idle/parameters/max_cstate - if this is >1 and you *think*
> C-states are disabled then your system is messing with you.
>
> Because the contemporary Intel power management driver
> (https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt) now
> limits the proliferation of OS level CPU power profiles/governors, the only
> way to force top frequencies is to either set kernel boot command line
> options or use the /dev/cpu_dma_latency, aka pmqos, interface.
>
> We did the latter using the pmqos_static.py, which was previously part of
> the RHEL6 tuned latency-performance profile, but seems to have been dropped
> in RHEL7 (don't yet know why), and in any case the default tuned profile is
> throughput-performance (which does not change cpu_dma_latency). You can find
> the pmqos-static.py script here
> https://github.com/NetSys/NetBricks/blob/master/scripts/tuning/pmqos-static.py.
>
> After setting `./pmqos-static.py cpu_dma_latency=0` across our OSD nodes we
> saw a conservative 30% increase in backfill and recovery throughput - now
> when our main RBD pool of 900+ OSDs is backfilling we expect to see ~22GB/s,
> previously that was ~15GB/s.
>
> We have just got around to opening a case with Red Hat regarding this as at
> minimum Ceph should probably be actively using the pmqos interface and tuned
> should be setting this with recommendations for the latency-performance
> profile in the RHCS install guide. We have done no characterisation of it on
> Ubuntu yet, however anecdotally it looks like it has similar issues on the
> same hardware.
>
> Merry xmas.
>
> Cheers,
> Blair
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com