Hi,

I have been testing some higher values for the osd_snap_trim_sleep variable, trying to reduce the impact of removing RBD snapshots on our cluster, and I have come across what I believe to be a possible unintended consequence. The sleep appears to hold the lock on the PG, so no other IO can use the PG while the snap removal operation is sleeping.

I had set the variable to 10s to minimise the impact as much as possible, since I had some multi-TB snapshots to remove, and noticed that suddenly all IO to the cluster had a latency of roughly 10s as well; all the dumped ops show "waiting on PG" for around 10s too.

Is the osd_snap_trim_sleep variable only ever meant to be used up to, say, a maximum of 0.1s, with this being a known side effect? Or should the lock on the PG be released so that normal IO can continue during the sleeps?

Nick

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
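
For reference, the setting described above can be expressed as a ceph.conf fragment like the following. This is an illustrative sketch, not the poster's exact configuration; the option can also typically be changed at runtime with injectargs, though the exact syntax varies by Ceph release.

```ini
# ceph.conf fragment (illustrative): throttle snapshot trimming on all OSDs.
# osd_snap_trim_sleep is in seconds; 10 is the value reported above to
# produce ~10s client IO latency, so far smaller values (e.g. 0.05-0.1)
# are the range being asked about.
[osd]
osd snap trim sleep = 10
```

A runtime equivalent (assuming the classic admin-socket/tell path) would be something like `ceph tell osd.* injectargs '--osd_snap_trim_sleep 10'`.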