On Thu, Jan 19, 2017 at 1:28 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> Hi Dan,
>
> I carried out some more testing after doubling the op threads. It may have
> had a small benefit, as potentially some threads are available, but latency
> still sits more or less around the configured snap sleep time. Even more
> threads might help, but I suspect you are just lowering the chance of IOs
> getting stuck behind the sleep, rather than actually solving the problem.
>
> I'm guessing that when the snap trimming was in the disk thread you
> wouldn't have noticed these sleeps, but now that it's in the op thread it
> will just sit there holding up all IO and be a lot more noticeable. It
> might be that this option shouldn't be used with Jewel+?

That's a good thought -- so we need confirmation of which thread is doing
the snap trimming. I honestly can't figure it out from the code --
hopefully a dev could explain how it works.

Otherwise, I don't have much practical experience with snap trimming in
jewel yet -- our RBD cluster is still running 0.94.9.
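If nobody more familiar with the code chimes in, one way to confirm it
empirically might be to catch an OSD mid-sleep and see which thread is
parked. A rough sketch, assuming a single ceph-osd process on a test box
(otherwise substitute the pid; thread names vary between releases, so treat
the output as a hint rather than proof):

  # watch per-thread CPU/state while a big snap trim is running
  top -H -p $(pidof ceph-osd)

  # or dump every thread's stack mid-trim and look for the one
  # sleeping under the snap trim path
  gdb -batch -p $(pidof ceph-osd) -ex 'thread apply all bt' > /tmp/osd-bt.txt
  grep -i -B2 -A8 snap /tmp/osd-bt.txt

If the sleeping frame shows up under an op worker thread rather than the
disk thread pool, that would confirm your theory.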
Cheers, Dan

>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Nick Fisk
>> Sent: 13 January 2017 20:38
>> To: 'Dan van der Ster' <dan@xxxxxxxxxxxxxx>
>> Cc: 'ceph-users' <ceph-users@xxxxxxxxxxxxxx>
>> Subject: Re: osd_snap_trim_sleep keeps locks PG during sleep?
>>
>> We're on Jewel, and you're right, I'm pretty sure the snap stuff is also
>> now handled in the op thread.
>>
>> The dump historic ops socket command showed a 10s delay at the "Reached
>> PG" stage. From Greg's response [1], it would suggest that it isn't the
>> OSD itself that is blocking, but rather the PG it is currently sleeping
>> on whilst trimming. I think in the former case it would have a high time
>> on the "Started" part of the op? Anyway, I will carry out some more
>> testing with higher osd op threads and see if that makes any difference.
>> Thanks for the suggestion.
>>
>> Nick
>>
>>
>> [1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008652.html
>>
>> > -----Original Message-----
>> > From: Dan van der Ster [mailto:dan@xxxxxxxxxxxxxx]
>> > Sent: 13 January 2017 10:28
>> > To: Nick Fisk <nick@xxxxxxxxxx>
>> > Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
>> > Subject: Re: osd_snap_trim_sleep keeps locks PG during sleep?
>> >
>> > Hammer or jewel? I've forgotten which thread pool is handling the snap
>> > trim nowadays -- is it the op thread yet? If so, perhaps all the op
>> > threads are stuck sleeping? Just a wild guess. (Maybe increasing the
>> > number of op threads would help?)
>> >
>> > -- Dan
>> >
>> >
>> > On Thu, Jan 12, 2017 at 3:11 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> > > Hi,
>> > >
>> > > I had been testing some higher values of the osd_snap_trim_sleep
>> > > variable to try and reduce the impact of removing RBD snapshots on
>> > > our cluster, and I have come across what I believe to be a possible
>> > > unintended consequence. The value of the sleep seems to keep the
>> > > lock on the PG open so that no other IO can use the PG whilst the
>> > > snap removal operation is sleeping.
>> > >
>> > > I had set the variable to 10s to completely minimise the impact, as
>> > > I had some multi-TB snapshots to remove, and noticed that suddenly
>> > > all IO to the cluster had a latency of roughly 10s as well; all the
>> > > dumped ops show waiting on PG for 10s too.
>> > >
>> > > Is the osd_snap_trim_sleep variable only ever meant to be used up
>> > > to, say, a max of 0.1s and this is a known side effect, or should
>> > > the lock on the PG be removed so that normal IO can continue during
>> > > the sleeps?
>> > >
>> > > Nick

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com