Here's an update: I tried running 3.18.0-rc5 over the weekend, to no avail. A load spike through Ceph shows no perceived improvement over the chassis running 3.10 kernels.

Here is a graph of *system* CPU time (not user); note that the 3.18 machine was a005.block:

http://ponies.io/raw/cluster.png

It is perhaps faring a little better than the chassis running 3.10, in that it did not have min_free_kbytes raised to 2GB as the others did; instead it was sitting at around 90MB.

The perf recording did look a little different, though I'm not sure whether that is just the luck of the draw in how the fractal rendering works:

http://ponies.io/raw/perf-3.10.png

Any pointers on how we can track this down? There are at least three of us following this now, so we should have plenty of scope for testing.