Hi,
On 04/02/15 19:31, Stillwell, Bryan wrote:
Recovery creates I/O performance drops in our VMs too, but it's manageable. What really hurts us are deep scrubs.

Our current situation is Firefly 0.80.9 with a total of 24 identical OSDs evenly distributed over 4 servers, with the following relevant configuration:

    osd recovery max active = 2
    osd scrub load threshold = 3
    osd deep scrub interval = 1209600  # 14 days
    osd max backfills = 4
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 7

We managed to add several OSDs at once while deep scrubs were in practice disabled: we just increased the deep scrub interval from 1 to 2 weeks, which if I understand correctly had the effect of disabling them for 1 week (every PG had already been deep scrubbed within the past week, so none became due before another week had passed; and indeed there were none while the backfilling went on for several hours). With these settings and no deep scrubs, the load increased a bit in the VMs doing non-negligible I/O, but this was manageable. Even the disk thread ioprio settings (which are what you want if you're after ionice behaviour for deep scrubs) didn't seem to make much of a difference.

Note: I don't believe Ceph tries to scatter the scrubs over the whole period you set with the deep scrub interval; its algorithm seems much simpler than that and may lead to temporary salvos of successive deep scrubs, which can generate bursts of I/O load that are hard to diagnose (by default scrubs and deep scrubs are logged by the OSDs, so you can correlate them with whatever you use to supervise your cluster).

I actually considered monitoring Ceph for backfills, running "ceph osd set nodeep-scrub" automatically when there are some and unsetting it when they disappear (see the PS below for a sketch).

Best regards,

Lionel Bouton
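PS: here is a minimal, untested sketch of the watchdog I have in mind. It assumes the ceph CLI is usable on the machine it runs on, and that grepping "ceph health detail" for "backfill" is a good-enough indicator of backfills in progress; the 60-second poll interval is arbitrary.

    #!/bin/bash
    # Untested sketch: disable deep scrubs while any PG is backfilling,
    # re-enable them once backfills are done. Run in a loop like this,
    # or adapt the body to a cron job.
    while true; do
        if ceph health detail | grep -q backfill; then
            # Backfills in progress: make sure deep scrubs stay off.
            ceph osd set nodeep-scrub
        else
            # No backfills left: allow deep scrubs again.
            ceph osd unset nodeep-scrub
        fi
        sleep 60
    done

Setting or unsetting the flag when it is already in the desired state is harmless, so the loop doesn't need to track state.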
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com