FYI, in my test I used osd_max_backfills = 10, which is the Hammer default. Post-Hammer it has been changed to 1.

Thanks & Regards
Somnath

-----Original Message-----
From: Christian Balzer [mailto:chibi@xxxxxxx]
Sent: Thursday, May 12, 2016 10:40 PM
To: Scottix
Cc: Somnath Roy; ceph-users@xxxxxxxxxxxxxx; Nick Fisk
Subject: Re: Weighted Priority Queue testing

Hello,

On Thu, 12 May 2016 15:41:13 +0000 Scottix wrote:

> We have run into this same scenario in terms of the long tail taking
> much longer on recovery than the initial.
>
> Either when we are adding an OSD or when an OSD gets taken down. At first we
> have max-backfill set to 1 so it doesn't kill the cluster with IO. As
> time passes, a single OSD ends up performing the backfill. So we
> gradually increase max-backfill up to 10 to reduce the amount of
> time it needs to recover fully. I know there are a few other factors
> at play here but for us we tend to do this procedure every time.
>

Yeah, as I wrote in my original mail "This becomes even more obvious when
backfills and recovery settings are lowered".

However my test cluster is at the default values, so it starts with a
(much too big) bang and ends with a whimper, not because it's throttled
but simply because there are so few PGs/OSDs to choose from.
Or so it seems, purely from observation.

Christian

> On Wed, May 11, 2016 at 6:29 PM Christian Balzer <chibi@xxxxxxx> wrote:
>
> > On Wed, 11 May 2016 16:10:06 +0000 Somnath Roy wrote:
> >
> > > I bumped up the backfill/recovery settings to match up Hammer. It
> > > is probably unlikely that long tail latency is a parallelism
> > > issue. If so, entire recovery would be suffering not the tail
> > > alone. It's probably a prioritization issue. Will start looking
> > > and update my findings. I can't add devl because of the table but
> > > needed to add community that's why ceph-users :-).. Also, wanted
> > > to know from Ceph's user if they are also facing similar issues..
> > >
> >
> > What I meant with lack of parallelism is that at the start of a
> > rebuild, there are likely to be many candidate PGs for recovery and
> > backfilling, so many things happen at the same time, up to the
> > limits of what is configured (max backfill etc).
> >
> > From looking at my test cluster, it starts with 8-10 backfills and
> > recoveries (out of 140 affected PGs), but later on in the game there
> > are fewer and fewer PGs (and OSDs/nodes) to choose from, so things
> > slow down around 60 PGs to just 3-4 backfills.
> > And around 20 PGs it's down to 1-2 backfills, so the parallelism is
> > clearly gone at that point and recovery speed is down to what a
> > single PG/OSD can handle.
> >
> > Christian
> >
> > > Thanks & Regards
> > > Somnath
> > >
> > > -----Original Message-----
> > > From: Christian Balzer [mailto:chibi@xxxxxxx]
> > > Sent: Wednesday, May 11, 2016 12:31 AM
> > > To: Somnath Roy
> > > Cc: Mark Nelson; Nick Fisk; ceph-users@xxxxxxxxxxxxxx
> > > Subject: Re: Weighted Priority Queue testing
> > >
> > >
> > >
> > > Hello,
> > >
> > > not sure if the Cc: to the users ML was intentional or not, but
> > > either way.
> > >
> > > The issue seen in the tracker:
> > > http://tracker.ceph.com/issues/15763
> > > and what you have seen (and I as well) feels a lot like the lack
> > > of parallelism towards the end of rebuilds.
> > >
> > > This becomes even more obvious when backfills and recovery
> > > settings are lowered.
> > >
> > > Regards,
> > >
> > > Christian
> > > --
> > > Christian Balzer Network/Systems Engineer
> > > chibi@xxxxxxx Global OnLine Japan/Rakuten Communications
> > > http://www.gol.com/
> > > PLEASE NOTE: The information contained in this electronic mail
> > > message is intended only for the use of the designated
> > > recipient(s) named above.
> > > If the reader of this message is not the
> > > intended recipient, you are hereby notified that you have received
> > > this message in error and that any review, dissemination,
> > > distribution, or copying of this message is strictly prohibited.
> > > If you have received this communication in error, please notify
> > > the sender by telephone or e-mail (as shown above) immediately and
> > > destroy any and all copies of this message in your possession
> > > (whether hard copies or electronically stored copies).
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
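
The explanation given in this thread (many candidate PGs at the start of a rebuild, few PGs/OSDs to choose from at the end) can be illustrated with a toy model. The sketch below is not Ceph's scheduler and is not taken from anyone's test setup. It assumes that every backfill occupies one slot on a source OSD and one on a target OSD, that a per-OSD cap (standing in for osd_max_backfills) limits those slots, and that PG placement and backfill durations are random. The names simulate, num_osds, num_pgs and max_backfills are made up for this illustration.

```python
import random


def simulate(num_osds=12, num_pgs=140, max_backfills=1, seed=0):
    """Toy model: return the number of backfills running at each time step.

    Assumption (not Ceph code): a backfill needs a free slot on both its
    source and its target OSD, and each OSD offers max_backfills slots.
    """
    rng = random.Random(seed)
    # Each pending PG is (source_osd, target_osd, steps_needed).
    pending = []
    for _ in range(num_pgs):
        src, dst = rng.sample(range(num_osds), 2)
        pending.append((src, dst, rng.randint(1, 5)))

    running = []
    timeline = []
    while pending or running:
        # Count the slots already taken by backfills still in flight.
        used = [0] * num_osds
        for src, dst, _ in running:
            used[src] += 1
            used[dst] += 1

        # Greedily start every pending PG whose two OSDs both have a free slot.
        still_pending = []
        for src, dst, steps in pending:
            if used[src] < max_backfills and used[dst] < max_backfills:
                used[src] += 1
                used[dst] += 1
                running.append((src, dst, steps))
            else:
                still_pending.append((src, dst, steps))
        pending = still_pending

        timeline.append(len(running))
        # Advance one time step and drop the backfills that just finished.
        running = [(s, d, t - 1) for s, d, t in running if t > 1]
    return timeline


if __name__ == "__main__":
    for limit in (1, 10):
        t = simulate(max_backfills=limit)
        quarter = max(1, len(t) // 4)
        print(
            f"max_backfills={limit}: {len(t)} steps, "
            f"avg concurrency first quarter {sum(t[:quarter]) / quarter:.1f}, "
            f"last quarter {sum(t[-quarter:]) / quarter:.1f}"
        )
```

Comparing the average concurrency in the first and last quarter of the run for different caps gives a rough feel for whether the tail is limited by the cap or by the shrinking set of remaining PGs. A real cluster has many more factors (recovery priorities, client IO, PG sizes), so this only serves to reason about the parallelism argument.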