I'm really glad to hear that it wasn't bluestore! :)
It raises another concern though. We didn't expect to see that much of a
slowdown with the current throttle settings. An order of magnitude
slowdown in recovery performance isn't ideal at all.
I wonder if we could improve things dramatically if we kept track of
client IO activity on the OSD and removed the throttle if there's been no
client activity for X seconds. Theoretically more advanced heuristics
might cover this, but in the interim it seems to me like this would
solve the very specific problem you are seeing while still throttling
recovery when IO is happening.
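
To make the idea concrete, here is a minimal sketch in Python. It is not Ceph code: the class name, method names, and the `idle_threshold` parameter (the "X seconds" above) are all hypothetical and chosen only for illustration. It assumes the OSD can call a hook on every client op and can ask for the sleep to apply before each recovery op.

```python
import time


class IdleAwareRecoveryThrottle:
    """Hypothetical sketch: drop the recovery sleep when clients are idle."""

    def __init__(self, recovery_sleep, idle_threshold, clock=time.monotonic):
        self.recovery_sleep = recovery_sleep  # seconds to sleep between recovery ops
        self.idle_threshold = idle_threshold  # the "X seconds" of client inactivity
        self._clock = clock                   # injectable so the logic can be tested
        self._last_client_io = None           # None means no client IO seen yet

    def note_client_io(self):
        """Call on every client op to record when clients were last active."""
        self._last_client_io = self._clock()

    def current_sleep(self):
        """Return the sleep to apply before the next recovery op."""
        if self._last_client_io is None:
            return 0.0  # no client IO ever seen: treat as idle
        idle_for = self._clock() - self._last_client_io
        if idle_for >= self.idle_threshold:
            return 0.0  # clients have been quiet long enough: no throttle
        return self.recovery_sleep
```

A recovery loop would call `time.sleep(throttle.current_sleep())` before each op, so the throttle comes back as soon as a client op is recorded again.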
Mark
On 09/14/2017 06:19 AM, Richard Hesketh wrote:
Yeah, that hit the nail on the head. Significantly reducing/eliminating the recovery sleep times increases the recovery speed back up to (and beyond!) the levels I was expecting to see - recovery is almost an order of magnitude faster now. Thanks for educating me about those changes!
Rich
On 14/09/17 11:16, Richard Hesketh wrote:
Hi Mark,
No, I wasn't familiar with that work. I am in fact comparing speed of recovery to maintenance work I did while the cluster was in Jewel; I haven't manually done anything to sleep settings, only adjusted max backfills OSD settings. New options that introduce arbitrary slowdown to recovery operations to preserve client performance would explain what I'm seeing! I'll have a tinker with adjusting those values (in my particular case client load on the cluster is very low and I don't have to honour any guarantees about client performance - getting back into HEALTH_OK asap is preferable).
Rich
On 13/09/17 21:14, Mark Nelson wrote:
Hi Richard,
Regarding recovery speed, have you looked through any of Neha's results on recovery sleep testing earlier this summer?
https://www.spinics.net/lists/ceph-devel/msg37665.html
She tested bluestore and filestore under a couple of different scenarios. The gist of it is that time to recover changes pretty dramatically depending on the sleep setting.
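
As a rough illustration of why the sleep matters so much, here is a back-of-the-envelope sketch in Python. It assumes a deliberately simplified model in which each recovery op takes a fixed time and is followed by a fixed sleep; the function name and the numbers are hypothetical and are not taken from Neha's results.

```python
def recovery_slowdown(op_seconds, sleep_seconds):
    """Slowdown factor relative to no sleep, in a simplified serial model.

    Assumes each recovery op takes op_seconds and is followed by a sleep of
    sleep_seconds, so one op completes every (op_seconds + sleep_seconds).
    """
    if op_seconds <= 0:
        raise ValueError("op_seconds must be positive")
    return (op_seconds + sleep_seconds) / op_seconds


# Illustrative numbers only: a 10 ms recovery op followed by a 100 ms sleep.
factor = recovery_slowdown(op_seconds=0.010, sleep_seconds=0.100)
print(f"recovery is about {factor:.0f}x slower than with no sleep")
```

In this model the sleep dominates whenever it is much longer than the op itself, which is how a sleep that looks small can cost an order of magnitude in recovery time.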
I don't recall if you said earlier, but are you comparing filestore and bluestore recovery performance on the same version of ceph with the same sleep settings?
Mark
On 09/12/2017 05:24 AM, Richard Hesketh wrote:
Thanks for the links. That does seem to largely confirm that I haven't horribly misunderstood anything and that I've not been doing anything obviously wrong while converting my disks: there's no point specifying separate WAL/DB partitions if they're going to go on the same device; throw as much space as you have available at the DB partitions and they'll use all the space they can; and significantly reduced I/O on the DB/WAL device compared to Filestore is expected, since bluestore's nixed the write amplification as much as possible.
I'm still seeing much reduced recovery speed on my newly Bluestored cluster, but I guess that's a tuning issue rather than evidence of catastrophe.
Rich
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com