Re: Odd cyclical cluster performance

Gregory Farnum <gfarnum@xxxxxxxxxx> · Mon, 15 May 2017 14:51:05 -0700



Did you try correlating it with PG scrubbing or other maintenance behaviors?
-Greg
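
One rough way to check for such a correlation is to line up the throughput samples with the periods when scrubbing was active and compute a correlation coefficient. The sketch below is only an illustration under stated assumptions: the sample times, throughput values, and scrub windows are made-up placeholders, and the helper names `pearson` and `scrub_indicator` are hypothetical. In practice the throughput series would be exported from Prometheus, and the scrub windows would have to be derived from the cluster log or the scrub timestamps that `ceph pg dump` reports.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(vx * vy)


def scrub_indicator(sample_times, scrub_windows):
    """1.0 for each sample time inside any (start, end) scrub window, else 0.0."""
    return [
        1.0 if any(start <= t < end for start, end in scrub_windows) else 0.0
        for t in sample_times
    ]


# Hypothetical data: seconds since the start of the bench run, write
# throughput in MB/s at those times, and assumed scrub windows.
sample_times = [0, 600, 1200, 1800, 2400, 3000, 3600, 4200]
write_mb_s = [40.0, 41.0, 22.0, 21.0, 39.0, 40.0, 20.0, 23.0]
scrub_windows = [(1200, 2400), (3600, 4800)]

r = pearson(write_mb_s, scrub_indicator(sample_times, scrub_windows))
print(f"correlation between throughput and scrubbing: {r:.2f}")
```

A coefficient close to -1 would suggest that the dips line up with scrubbing; a value near 0 would point towards some other cause. The same `pearson` helper could also be applied to the OSD and SSD journal activity series to put a number on the anti-correlation described below.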

On Thu, May 11, 2017 at 12:47 PM, Patrick Dinnen <pdinnen@xxxxxxxxx> wrote:
> Seeing some odd behaviour while testing using rados bench. This is on
> a pre-split pool, two node cluster with 12 OSDs total.
>
> ceph osd pool create newerpoolofhopes 2048 2048 replicated "" replicated_ruleset 500000000
>
> rados -p newerpoolofhopes bench -t 32 -b 20000 30000000 write --no-cleanup
>
> Using Prometheus/Grafana to watch what's going on, we see oddly
> regular peaks and dips in write performance. The frequency changes
> gradually but it's on the order of hours (not the seconds that might
> seem easier to explain by system phenomena). It starts off at roughly
> one cycle per hour and we've seen it for multiple days of constant
> bench running with nothing else happening on the cluster.
>
> A bunch of graphs showing the pattern:
>
> https://ibb.co/djXUVk
> https://ibb.co/gMNk35
> https://ibb.co/iKViqk
> https://ibb.co/jOXJO5
> https://ibb.co/isUMbQ
>
> sdg and sdi are SSD journal disks. The activity on the OSDs and SSDs
> seems anti-correlated. SSDs peak in activity as OSDs reach the bottom
> of the trough. Then the reverse. Repeat.
>
> Does anyone have any suggestions as to what could possibly be causing
> a regular pattern like this at such a low frequency?
>
> Thanks, Patrick Dinnen
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com