On 06/01/15 09:43, Jan Schermer wrote:
> We had to disable deep scrub or the cluster would be unusable - we need to turn it back on sooner or later, though.
> With minimal scrubbing and recovery settings, everything is mostly good. It turned out many of the issues we had were due to too few PGs - once we increased them from 4K to 16K everything sped up nicely (because the chunks are smaller), but during heavy activity we still get some "slow IOs".
> I believe there is an ionice knob in newer versions (we still run Dumpling), and that should do the trick no matter how much additional "load" is put on the OSDs.
> Everybody's bottleneck will be different - we run all flash, so disk IO is not a problem but the OSD daemon is - no ionice setting will help with that, it just needs to be faster ;-)

If you are interested, I'm currently testing a ruby script that schedules the deep scrubs one at a time, trying simultaneously to fit them into a given time window, avoid successive scrubs on the same OSD, and space the deep scrubs according to the amount of data scrubbed. I use it because Ceph by itself can't prevent multiple scrubs from happening simultaneously on the network, and that can severely impact our VM performance. I can clean it up and post it on GitHub.

Best regards,

Lionel
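[Editor's note: Lionel's actual Ruby script is not included in this message. The following is a rough Python sketch of the scheduling idea he describes - one deep scrub at a time, oldest deep scrub first, skipping PGs that share an acting OSD with the previous scrub, and pausing in proportion to the data scrubbed so everything fits in a window. The JSON field names ("pg_stats", "pgid", "acting", "last_deep_scrub_stamp", "stat_sum"/"num_bytes") are taken from `ceph pg dump --format json` and may differ between releases; treat them as assumptions.]

```python
#!/usr/bin/env python
# Hypothetical sketch (not Lionel's Ruby script): issue deep scrubs one PG at a
# time, ordered by oldest last_deep_scrub_stamp, avoiding back-to-back scrubs on
# the same OSDs and spacing them so the whole pass roughly fits a time window.
import json
import subprocess
import time

WINDOW_SECONDS = 7 * 24 * 3600   # spread all deep scrubs over one week (assumed)
MIN_PAUSE = 30                   # floor between two scrub requests, in seconds

def pg_stats():
    # Parse the per-PG statistics from the cluster; field names are assumptions.
    out = subprocess.check_output(["ceph", "pg", "dump", "--format", "json"])
    return json.loads(out)["pg_stats"]

def schedule():
    # Oldest deep scrub first, so neglected PGs get handled early in the window.
    pgs = sorted(pg_stats(), key=lambda p: p["last_deep_scrub_stamp"])
    total_bytes = sum(p["stat_sum"]["num_bytes"] for p in pgs) or 1
    last_osds = set()
    for pg in pgs:
        acting = set(pg["acting"])
        if acting & last_osds:
            # Skip for now: avoid hitting the same OSD twice in a row.
            continue
        subprocess.check_call(["ceph", "pg", "deep-scrub", pg["pgid"]])
        last_osds = acting
        # Pause in proportion to this PG's share of the data, so larger PGs get
        # a larger slice of the window and scrubs do not pile up on the network.
        share = float(pg["stat_sum"]["num_bytes"]) / total_bytes
        time.sleep(max(MIN_PAUSE, share * WINDOW_SECONDS))

if __name__ == "__main__":
    schedule()
```

A real script would also need to handle PGs skipped because of the OSD-overlap check (e.g. retry them on a later pass) and ideally watch for scrub completion instead of sleeping blindly; this sketch only illustrates the scheduling logic.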