I do run with osd_max_backfills and osd_recovery_max_active turned up quite a bit from the defaults, as I'm trying for as much recovery throughput as possible. I would hazard a guess that the impact seen from the sleep settings is proportionally much smaller if your other recovery-related parameters are more default - but it starts to dominate if you remove other bottlenecks on recovery I/O.

Rich

On 14/09/17 15:02, Mark Nelson wrote:
> I'm really glad to hear that it wasn't bluestore! :)
>
> It raises another concern though. We didn't expect to see that much of a slowdown with the current throttle settings. An order of magnitude slowdown in recovery performance isn't ideal at all.
>
> I wonder if we could improve things dramatically if we kept track of client IO activity on the OSD and removed the throttle if there's been no client activity for X seconds. Theoretically more advanced heuristics might cover this, but in the interim it seems to me like this would solve the very specific problem you are seeing while still throttling recovery when IO is happening.
>
> Mark
>
> On 09/14/2017 06:19 AM, Richard Hesketh wrote:
>> Yeah, that hit the nail on the head. Significantly reducing/eliminating the recovery sleep times increases the recovery speed back up to (and beyond!) the levels I was expecting to see - recovery is almost an order of magnitude faster now. Thanks for educating me about those changes!
>>
>> Rich
>>
>> On 14/09/17 11:16, Richard Hesketh wrote:
>>> Hi Mark,
>>>
>>> No, I wasn't familiar with that work. I am in fact comparing speed of recovery to maintenance work I did while the cluster was on Jewel; I haven't manually done anything to the sleep settings, only adjusted the max backfills OSD settings. New options that introduce arbitrary slowdown to recovery operations to preserve client performance would explain what I'm seeing! I'll have a tinker with adjusting those values (in my particular case client load on the cluster is very low and I don't have to honour any guarantees about client performance - getting back into HEALTH_OK asap is preferable).
>>>
>>> Rich
>>>
>>> On 13/09/17 21:14, Mark Nelson wrote:
>>>> Hi Richard,
>>>>
>>>> Regarding recovery speed, have you looked through any of Neha's results on recovery sleep testing earlier this summer?
>>>>
>>>> https://www.spinics.net/lists/ceph-devel/msg37665.html
>>>>
>>>> She tested bluestore and filestore under a couple of different scenarios. The gist of it is that time to recover changes pretty dramatically depending on the sleep setting.
>>>>
>>>> I don't recall if you said earlier, but are you comparing filestore and bluestore recovery performance on the same version of ceph with the same sleep settings?
>>>>
>>>> Mark
>>>>
>>>> On 09/12/2017 05:24 AM, Richard Hesketh wrote:
>>>>> Thanks for the links. That does seem to largely confirm that I haven't horribly misunderstood anything and I've not been doing anything obviously wrong while converting my disks; there's no point specifying separate WAL/DB partitions if they're going to go on the same device, throw as much space as you have available at the DB partitions and they'll use all the space they can, and significantly reduced I/O on the DB/WAL device compared to Filestore is expected since bluestore's nixed the write amplification as much as possible.
>>>>>
>>>>> I'm still seeing much reduced recovery speed on my newly Bluestored cluster, but I guess that's a tuning issue rather than evidence of catastrophe.
>>>>>
>>>>> Rich

--
Richard Hesketh
Systems Engineer, Research Platforms
BBC Research & Development
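
For illustration only, here is a minimal Python sketch of the heuristic Mark floats above: apply the recovery sleep only while client I/O has been seen within the last X seconds, and drop it once the OSD has been idle. This is not how Ceph implements throttling; the class name, method names and the numeric defaults are all hypothetical placeholders chosen for the example.

```python
import time


class IdleAwareRecoveryThrottle:
    """Hypothetical model: sleep between recovery ops only while
    client I/O has been observed recently."""

    def __init__(self, recovery_sleep=0.1, idle_seconds=5.0, clock=time.monotonic):
        self.recovery_sleep = recovery_sleep  # sleep used while clients are active (placeholder value)
        self.idle_seconds = idle_seconds      # the "X seconds" in the suggestion (placeholder value)
        self._clock = clock                   # injectable clock, so the logic can be tested
        self._last_client_io = None           # None means no client I/O has been seen yet

    def note_client_io(self):
        """Record that a client operation was just serviced."""
        self._last_client_io = self._clock()

    def sleep_for_next_recovery_op(self):
        """Return how long to sleep before the next recovery op."""
        if self._last_client_io is None:
            return 0.0
        idle_for = self._clock() - self._last_client_io
        # Idle long enough: lift the throttle. Otherwise keep protecting clients.
        return 0.0 if idle_for >= self.idle_seconds else self.recovery_sleep


if __name__ == "__main__":
    throttle = IdleAwareRecoveryThrottle()
    throttle.note_client_io()
    print(throttle.sleep_for_next_recovery_op())
```

On a cluster like the one described in this thread, where client load is very low, a scheme of this shape would leave recovery unthrottled most of the time while still backing off whenever clients show up.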
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com