Did you try correlating it with PG scrubbing or other maintenance behaviors?
-Greg

On Thu, May 11, 2017 at 12:47 PM, Patrick Dinnen <pdinnen@xxxxxxxxx> wrote:
> Seeing some odd behaviour while testing using rados bench. This is on
> a pre-split pool, two node cluster with 12 OSDs total.
>
> ceph osd pool create newerpoolofhopes 2048 2048 replicated "" replicated_ruleset 500000000
>
> rados -p newerpoolofhopes bench -t 32 -b 20000 30000000 write --no-cleanup
>
> Using Prometheus/Grafana to watch what's going on, we see oddly
> regular peaks and dips in writer performance. The frequency changes
> gradually but it's on the order of hours (not the seconds that might
> seem easier to explain by system phenomena). It starts off at roughly
> one cycle per hour and we've seen it for multiple days of constant
> bench running with nothing else happening on the cluster.
>
> A bunch of graphs showing the pattern:
>
> https://ibb.co/djXUVk
> https://ibb.co/gMNk35
> https://ibb.co/iKViqk
> https://ibb.co/jOXJO5
> https://ibb.co/isUMbQ
>
> sdg and sdi are SSD journal disks. The activity on the OSDs and SSDs
> seems anti-correlated. SSDs peak in activity as OSDs reach the bottom
> of the trough. Then the reverse. Repeat.
>
> Does anyone have any suggestions as to what could possibly be causing
> a regular pattern like this at such a low frequency?
>
> Thanks, Patrick Dinnen
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
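
One possible way to act on the scrubbing question above, as a rough sketch rather than anything from the thread: bucket the scrub timestamps into fixed time windows and compute a Pearson correlation against the bench throughput sampled over the same windows. The function names (`bucket_counts`, `pearson`) and the input lists are hypothetical; how the scrub times and throughput samples are collected (Prometheus export, OSD logs, per-PG scrub stamps) is left open and is an assumption here. The same `pearson` helper could also be pointed at the SSD and OSD activity series to put a number on the anti-correlation described in the original message.

```python
import math


def bucket_counts(event_times, start, end, width):
    """Count events per window [start + i*width, start + (i+1)*width).

    event_times, start, end and width are all in the same unit (e.g. seconds).
    Events outside [start, end) are ignored.
    """
    if width <= 0 or end <= start:
        raise ValueError("need width > 0 and end > start")
    n = int(math.ceil((end - start) / width))
    counts = [0] * n
    for t in event_times:
        if start <= t < end:
            idx = int((t - start) // width)
            counts[min(idx, n - 1)] += 1  # clamp guards float rounding at the edge
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series.

    Returns 0.0 if either series is constant (correlation undefined).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)
```

Usage would be along the lines of `pearson(bucket_counts(scrub_times, t0, t1, 600), throughput_per_10min)`, where `scrub_times` and `throughput_per_10min` are placeholder names for data you have gathered yourself. A value near -1 would suggest that the dips line up with scrub activity, and a value near 0 would point elsewhere.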