Hi all,
We recently noticed that, despite the BIOS power profile being set to performance on our RHEL7 Dell R720 Ceph OSD nodes, CPU frequencies never seemed to reach the top of the range, and the cores in fact spent a lot of time in low-power C-states despite that BIOS option supposedly disabling C-states.
After some investigation, this C-state issue seems to be relatively common; apparently the BIOS setting is more of a configuration hint that the OS can choose to ignore. You can check this by examining /sys/module/intel_idle/parameters/max_cstate - if this is >1 and you *think* C-states are disabled, then your system is messing with you.
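For a quick sanity check, something like the following minimal Python sketch (assuming the intel_idle driver and the standard sysfs cpuidle layout) will show the max_cstate limit along with the idle states the kernel is actually prepared to use and their exit latencies:

    #!/usr/bin/env python
    # Sanity check: is the OS still allowed to use deep C-states?
    # Assumes the intel_idle driver and standard sysfs cpuidle paths.
    from glob import glob

    with open("/sys/module/intel_idle/parameters/max_cstate") as f:
        print("intel_idle max_cstate: %s" % f.read().strip())

    # List the idle states exposed for cpu0, with their exit latencies.
    for state in sorted(glob("/sys/devices/system/cpu/cpu0/cpuidle/state*")):
        name = open(state + "/name").read().strip()
        latency = open(state + "/latency").read().strip()
        print("%s: %s (exit latency %s us)" % (state.split("/")[-1], name, latency))

If that prints anything deeper than C1 while you believe C-states are off in the BIOS, the OS is overriding you.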
Because the contemporary Intel P-state driver (https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt) now limits the OS-level CPU frequency governors on offer, the only ways to force top frequencies are either to set kernel boot command-line options or to use the /dev/cpu_dma_latency, aka pmqos, interface.
We did the latter using the pmqos-static.py script, which was previously part of the RHEL6 tuned latency-performance profile but seems to have been dropped in RHEL7 (we don't yet know why); in any case the default tuned profile is throughput-performance, which does not touch cpu_dma_latency. You can find the pmqos-static.py script here: https://github.com/NetSys/NetBricks/blob/master/scripts/tuning/pmqos-static.py.
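The essence of the script is tiny: it opens /dev/cpu_dma_latency, writes the requested latency target as a 32-bit value, and then simply holds the file descriptor open, because the kernel drops the PM QoS request as soon as the fd is closed. A minimal sketch of that idea (not the actual script) looks something like:

    #!/usr/bin/env python
    # Minimal sketch of the pmqos-static.py idea (not the actual script):
    # request a 0us cpu_dma_latency target and hold it until interrupted.
    # The kernel removes the PM QoS request when the fd is closed, so the
    # process must stay alive for the setting to remain in effect.
    import os
    import signal
    import struct

    fd = os.open("/dev/cpu_dma_latency", os.O_WRONLY)
    os.write(fd, struct.pack("=i", 0))   # 32-bit latency target in microseconds
    try:
        signal.pause()                   # sleep until killed, keeping the fd open
    finally:
        os.close(fd)                     # closing the fd drops the request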
After running `./pmqos-static.py cpu_dma_latency=0` across our OSD nodes we saw a conservative 30% increase in backfill and recovery throughput - when our main RBD pool of 900+ OSDs is backfilling we now expect to see ~22GB/s, where previously it was ~15GB/s.
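If you want to convince yourself it's taking effect, watching per-core frequencies is enough; a quick look like the following (assuming the usual cpufreq sysfs layout) should show cores sitting at or near the top of the range once the latency request is in place:

    #!/usr/bin/env python
    # Print current and maximum frequency per core (assumes cpufreq sysfs layout).
    from glob import glob

    for cpu in sorted(glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq")):
        cur = int(open(cpu + "/scaling_cur_freq").read())
        top = int(open(cpu + "/cpuinfo_max_freq").read())
        print("%s: %.2f GHz (max %.2f GHz)" % (cpu.split("/")[-2], cur / 1e6, top / 1e6))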
We have just got around to opening a case with Red Hat about this, as at a minimum Ceph should probably be actively using the pmqos interface, tuned should be setting this, and the RHCS install guide should recommend the latency-performance profile. We have done no characterisation of it on Ubuntu yet, however anecdotally it looks like it has similar issues on the same hardware.
Merry xmas.
Cheers,
Blair