We also noticed a tremendous gain in latency performance by setting the C-state kernel parameters processor.max_cstate=1 intel_idle.max_cstate=0. We went from over 1ms latency for 4KB writes to well under that (~0.7ms, going off memory). I will note that we did not have as much of a problem on Intel v3 procs, but on v4 procs our low-QD, single-threaded write performance dropped tremendously. I don't recall the exact figures now, but it was much worse than just a 30% loss in performance compared to a v3 proc with default C-states set. We only saw a small bump in power usage. Bumping the CPU frequency up also made a small difference.

Warren Wang
Walmart ✻

On 5/3/17, 3:43 AM, "ceph-users on behalf of Dan van der Ster" <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of dan@xxxxxxxxxxxxxx> wrote:

Hi Blair,

We use cpu_dma_latency=1, because it was in the latency-performance profile. And indeed, by setting cpu_dma_latency=0 on one of our OSD servers, powertop now shows the package as 100% in turbo mode. So I suppose we'll pay for this performance boost in energy. But more importantly, can the CPU survive being in turbo 100% of the time?

-- Dan

On Wed, May 3, 2017 at 9:13 AM, Blair Bethwaite <blair.bethwaite@xxxxxxxxx> wrote:
> Hi all,
>
> We recently noticed that despite having BIOS power profiles set to performance on our RHEL7 Dell R720 Ceph OSD nodes, CPU frequencies never seemed to be getting into the top of the range, and in fact spent a lot of time in low C-states despite that BIOS option supposedly disabling C-states.
>
> After some investigation this C-state issue seems to be relatively common; apparently the BIOS setting is more of a config option that the OS can choose to ignore. You can check this by examining /sys/module/intel_idle/parameters/max_cstate - if this is >1 and you *think* C-states are disabled, then your system is messing with you.
>
> Because the contemporary Intel power management driver (https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt) now limits the proliferation of OS-level CPU power profiles/governors, the only way to force top frequencies is to either set kernel boot command line options or use the /dev/cpu_dma_latency, aka pmqos, interface.
>
> We did the latter using the pmqos-static.py script, which was previously part of the RHEL6 tuned latency-performance profile but seems to have been dropped in RHEL7 (don't yet know why), and in any case the default tuned profile is throughput-performance (which does not change cpu_dma_latency). You can find the pmqos-static.py script here: https://github.com/NetSys/NetBricks/blob/master/scripts/tuning/pmqos-static.py
>
> After setting `./pmqos-static.py cpu_dma_latency=0` across our OSD nodes we saw a conservative 30% increase in backfill and recovery throughput - now when our main RBD pool of 900+ OSDs is backfilling we expect to see ~22GB/s, where previously that was ~15GB/s.
>
> We have just got around to opening a case with Red Hat regarding this, as at minimum Ceph should probably be actively using the pmqos interface, and tuned should be setting this, with recommendations for the latency-performance profile in the RHCS install guide. We have done no characterisation of it on Ubuntu yet; however, anecdotally it looks like it has similar issues on the same hardware.
>
> Merry xmas.
>
> Cheers,
> Blair
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
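
For anyone curious what pmqos-static.py does under the hood, here is a minimal Python sketch of the /dev/cpu_dma_latency mechanism Dan and Blair describe above. It assumes the standard kernel PM QoS behaviour: the value written is a native-endian 32-bit integer in microseconds, and the request only stays in force while the file descriptor is held open (which is why the tuned helper daemonizes rather than exiting). This is a sketch, not the actual tuned/pmqos-static.py implementation.

    #!/usr/bin/env python
    # Minimal sketch of the pmqos cpu_dma_latency mechanism discussed above.
    # The kernel honours the request only while the file descriptor stays
    # open, so this process must keep running. The value is a native-endian
    # 32-bit integer in microseconds; 0 keeps cores out of deep C-states.
    # Requires root.

    import os
    import signal
    import struct
    import sys

    TARGET_LATENCY_US = 0

    def hold_cpu_dma_latency(latency_us):
        fd = os.open("/dev/cpu_dma_latency", os.O_WRONLY)
        os.write(fd, struct.pack("i", latency_us))
        return fd  # closing this fd releases the latency request

    if __name__ == "__main__":
        fd = hold_cpu_dma_latency(TARGET_LATENCY_US)
        print("holding /dev/cpu_dma_latency at %dus; Ctrl-C to release"
              % TARGET_LATENCY_US)
        try:
            signal.pause()  # sleep until interrupted
        except KeyboardInterrupt:
            pass
        finally:
            os.close(fd)
            sys.exit(0)

Closing the descriptor (or killing the process) releases the request, and the cores are free to drop back into deep C-states again.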
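
And a small sketch of the sanity checks discussed in the thread (is the BIOS "performance" profile actually being honoured?), assuming the intel_idle driver and the standard cpuidle sysfs layout; paths and state names may differ with other drivers or kernels.

    #!/usr/bin/env python
    # Sketch of the checks described in the thread: are deep C-states really
    # disabled, or is the OS ignoring the BIOS setting?
    # Assumes the intel_idle driver and the standard cpuidle sysfs layout.

    import glob

    def read(path):
        with open(path) as f:
            return f.read().strip()

    # 1. Kernel command line: Warren's parameters (processor.max_cstate=1
    #    intel_idle.max_cstate=0) show up here if they are in effect.
    print("cmdline:    %s" % read("/proc/cmdline"))

    # 2. intel_idle's view: >1 means deep C-states are still available to
    #    the OS regardless of what the BIOS menu claims.
    print("max_cstate: %s" % read("/sys/module/intel_idle/parameters/max_cstate"))

    # 3. Per-state residency on CPU0: non-zero usage/time for the deeper
    #    states means the cores really are dropping into them.
    for state in sorted(glob.glob("/sys/devices/system/cpu/cpu0/cpuidle/state*")):
        print("%-12s usage=%-10s time_us=%s" %
              (read(state + "/name"), read(state + "/usage"),
               read(state + "/time")))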