Hi,

Which version of Ceph are you using? This could be related:

    http://tracker.ceph.com/issues/9487

See "ReplicatedPG: don't move on to the next snap immediately". Basically, the OSD gets into a tight loop "trimming" the snapshot objects. The fix above breaks out of that loop more frequently, and you can then use the osd snap trim sleep option to throttle the trimming further. I'm not sure the fix alone will be enough if you have many objects to remove per snapshot.

That commit is only in giant at the moment. The backport to dumpling is in the dumpling branch but not yet in a release, and the firefly backport is still pending.
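Once you are on a version with that fix, the throttle would look roughly like the following. The 0.1 second value is just an illustrative starting point, not a recommendation, and I haven't double-checked whether injectargs applies this particular option at runtime on firefly, so you may need to set it in ceph.conf and restart the OSDs instead:

    # at runtime, across all OSDs (if injectargs takes effect for this option):
    $ ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1'

    # or persistently in the [osd] section of ceph.conf, followed by an OSD restart:
    [osd]
    osd snap trim sleep = 0.1

A longer sleep stretches the trim out over more wall-clock time but leaves the disks more room for client I/O, which sounds like the trade-off you are after.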
Cheers,
Dan

> On 01 Dec 2014, at 10:51, Daniel Schneller <daniel.schneller@xxxxxxxxxxxxxxxx> wrote:
>
> Hi!
>
> We take regular (nightly) snapshots of our Rados Gateway pools for
> backup purposes. This allows us - with some manual pokery - to restore
> clients' documents should they delete them accidentally.
>
> The cluster is a 4-server setup with 12x4TB spinning disks each,
> totaling about 175TB. We are running firefly.
>
> We have now completed our first month of snapshots and want to remove
> the oldest ones. Unfortunately, doing so practically kills everything
> else that is using the cluster, because performance drops to almost zero
> while the OSDs work their disks at 100% (as per iostat). It seems to be
> the same phenomenon I asked about some time ago, when we were deleting
> whole pools.
>
> I could not find any way to throttle the background deletion activity
> (the command returns almost immediately). Here is a graph of the I/O
> operations waiting (colored by device) while deleting a few snapshots.
> Each of the "blocks" in the graph shows one snapshot being removed. The
> big one in the middle was a snapshot of the .rgw.buckets pool. It took
> about 15 minutes, during which basically nothing relying on the cluster
> was working due to immense slowdowns. This included users getting
> kicked off their SSH sessions due to timeouts.
>
> https://public.centerdevice.de/8c95f1c2-a7c3-457f-83b6-834688e0d048
>
> While this is a big issue in itself for us, we would at least like to
> estimate how long the process will take per snapshot / per pool. I
> assume the time needed is a function of the number of objects that were
> modified between two snapshots. We tried to get an idea of at least how
> many objects were added/removed in total by running `rados df` with a
> snapshot specified as a parameter, but it seems we always get the
> current values:
>
> $ sudo rados -p .rgw df --snap backup-20141109
> selected snap 13 'backup-20141109'
> pool name       category        KB      objects
> .rgw            -               276165  1368545
>
> $ sudo rados -p .rgw df --snap backup-20141124
> selected snap 28 'backup-20141124'
> pool name       category        KB      objects
> .rgw            -               276165  1368546
>
> $ sudo rados -p .rgw df
> pool name       category        KB      objects
> .rgw            -               276165  1368547
>
> So there are a few questions:
>
> 1) Is there any way to control how much such an operation will
> tax the cluster (we would be happy to have it run longer if that meant
> not utilizing all disks fully during that time)?
>
> 2) Is there a way to get a decent approximation of how much work
> deleting a specific snapshot will entail (in terms of objects, time,
> whatever)?
>
> 3) Would SSD journals help here? Or any other hardware configuration
> change, for that matter?
>
> 4) Any other recommendations? We definitely need to remove the data,
> not because of a lack of space (at least not at the moment), but because
> when customers delete stuff / cancel accounts, we are obliged to remove
> their data within a reasonable amount of time.
>
> Cheers,
> Daniel

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com