Which version of Ceph are you running right now and seeing this with? (Sam reworked it a bit for Cuttlefish, and it was in some of the dev releases.) Snapshot deletes are a little more expensive than we'd like, but I'm surprised they're doing this badly for you. :/
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Sun, Apr 21, 2013 at 2:16 AM, Olivier Bonvalet <olivier.bonvalet@xxxxxxxxx> wrote:
> Hi,
>
> I have a backup script which, every night:
>  * creates a snapshot of each RBD image
>  * then deletes all snapshots older than 15 days
>
> The problem is that "rbd snap rm XXX" overloads my cluster for hours
> (6 hours today...).
>
> I see several problems here:
>
> #1 "rbd snap rm XXX" is not blocking. The erase is done in the
> background, and I know of no way to verify that it has completed. So I
> add sleeps between the rm calls, but I have to estimate how long each
> will take.
>
> #2 "rbd (snap) rm" is sometimes very, very slow. I don't know whether
> it's because of XFS or not, but all my OSDs are at 100% I/O usage
> (reported by iostat).
>
> So:
> * Is there a way to reduce the priority of "snap rm", to avoid
>   overloading the cluster?
> * Is there a way to have a blocking "snap rm" that waits until the
>   delete has completed?
> * Is there a way to speed up "snap rm"?
>
> Note that my cluster has too few PGs (200 PGs for 40 active OSDs; I'm
> trying to progressively migrate data to a newer pool). Could that be
> the source of the problem?
>
> Thanks,
>
> Olivier
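
For illustration, here is a minimal sketch of the nightly rotation
described in the quoted mail. The pool name, the date-based snapshot
naming, and the ten-minute pacing are assumptions for the example, not
details from the thread:

#!/bin/bash
# Minimal sketch of the nightly snapshot rotation described above.
# The pool name and the YYYYMMDD naming convention are assumptions.
POOL=rbd
CUTOFF=$(date -d '-15 days' +%Y%m%d)   # keep 15 days, per the mail
TODAY=$(date +%Y%m%d)

for IMG in $(rbd ls "$POOL"); do
    # One snapshot per image per night, named by date.
    rbd snap create "$POOL/$IMG@$TODAY"

    # Delete snapshots older than the cutoff.  "rbd snap ls" prints a
    # header line, then "ID NAME SIZE" rows; field 2 is the name.
    for SNAP in $(rbd snap ls "$POOL/$IMG" | awk 'NR > 1 {print $2}'); do
        if [[ "$SNAP" =~ ^[0-9]{8}$ ]] && (( 10#$SNAP < 10#$CUTOFF )); then
            rbd snap rm "$POOL/$IMG@$SNAP"
            sleep 600   # crude pacing between deletes, as in the mail
        fi
    done
done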
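
On the "blocking" question: "rbd snap rm" returns once the snapshot is
unlinked from the image, while the OSDs reclaim the space
asynchronously, which is why the load continues after the command
exits. A crude stand-in for a blocking remove is to poll pool usage
until it stops shrinking. In this sketch the awk field number, the
poll interval, and the stability threshold are all assumptions; check
the "rados df" column layout on your release before relying on it:

wait_for_trim() {
    # Heuristic: treat trimming as "done" once the pool's used KB has
    # been stable for three consecutive polls.  Field 2 of "rados df"
    # is commonly the used-KB column -- verify on your release.
    local pool=$1 last=-1 stable=0 used
    while (( stable < 3 )); do
        used=$(rados df | awk -v p="$pool" '$1 == p {print $2}')
        if [ "$used" = "$last" ]; then
            stable=$((stable + 1))
        else
            stable=0
        fi
        last=$used
        sleep 60
    done
}

# Usage (image and snapshot names are placeholders):
# rbd snap rm rbd/myimage@20130406 && wait_for_trim rbd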
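
On the final PG note: 200 PGs over 40 OSDs is roughly 5 per OSD, well
below the commonly cited target of around 100 PGs per OSD, so it could
plausibly concentrate the trim load on a few disks. A quick sizing
check, with placeholder pool names and replica count:

# Inspect the pg_num of an existing pool (pool name is a placeholder):
ceph osd pool get rbd pg_num

# Rule of thumb: total PGs ~= (OSDs * 100) / replicas, rounded up to a
# power of two; e.g. 40 OSDs * 100 / 3 replicas ~= 1333 -> 2048.
ceph osd pool create newpool 2048 2048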