Re: Speeding up garbage collection in RGW

If you want the deletes to go through gc, I think you should increase this:

OPTION(rgw_gc_processor_max_time, OPT_INT, 3600)  // total run time for a single gc processor work

and decrease this:

OPTION(rgw_gc_processor_period, OPT_INT, 3600)  // gc processor cycle time

Or, I wonder whether there is some option to bypass the gc altogether.
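
For reference, a rough sketch of where those settings would live in ceph.conf. The section name client.rgw.gateway1 is a made-up instance name, and the values shown are simply the defaults quoted in the original message below, not tuned recommendations:

```ini
# Hypothetical RGW instance section; substitute your own gateway's name.
[client.rgw.gateway1]
# Defaults as quoted in this thread; adjust to taste.
rgw gc max objs = 32
rgw gc obj min wait = 7200
# Total run time for a single gc processor pass (suggested above: increase).
rgw gc processor max time = 3600
# gc processor cycle time (suggested above: decrease).
rgw gc processor period = 3600
```

The gateway would most likely need a restart to pick up changes made this way.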


On Tue, Jul 25, 2017 at 5:05 AM, Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
> Wouldn't doing it that way cause problems since references to the objects wouldn't be getting removed from .rgw.buckets.index?
>
> Bryan
>
> From: Roger Brown <rogerpbrown@xxxxxxxxx>
> Date: Monday, July 24, 2017 at 2:43 PM
> To: Bryan Stillwell <bstillwell@xxxxxxxxxxx>, "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re:  Speeding up garbage collection in RGW
>
> I hope someone else can answer your question better, but in my case I found something like this helpful to delete objects faster than I could through the gateway:
>
> rados -p default.rgw.buckets.data ls | grep 'replace this with pattern matching files you want to delete' | xargs -d '\n' -n 200 rados -p default.rgw.buckets.data rm
>
>
> On Mon, Jul 24, 2017 at 2:02 PM Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
> I'm in the process of cleaning up a test that an internal customer did on our production cluster that produced over a billion objects spread across 6000 buckets.  So far I've been removing the buckets like this:
>
> printf %s\\n bucket{1..6000} | xargs -I{} -n 1 -P 32 radosgw-admin bucket rm --bucket={} --purge-objects
>
> However, the disk usage doesn't seem to be getting reduced at the same rate the objects are being removed.  From what I can tell a large number of the objects are waiting for garbage collection.
>
> When I first read the docs it sounded like the garbage collector would only remove 32 objects every hour, but after looking through the logs I'm seeing about 55,000 objects removed every hour.  That's about 1.3 million a day, so at this rate it'll take a couple years to clean up the rest!  For comparison, the purge-objects command above is removing (but not GC'ing) about 30 million objects a day, so a much more manageable 33 days to finish.
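
A quick sanity check of the figures just quoted, as a small shell sketch. The function names are made up for illustration, and "over a billion objects" is assumed to mean exactly 1,000,000,000:

```shell
#!/bin/sh
# Back-of-the-envelope estimate using the rates quoted in this thread.

# objects_per_day RATE_PER_HOUR -> objects removed in 24 hours
objects_per_day() {
    echo $(( $1 * 24 ))
}

# days_to_clear TOTAL_OBJECTS RATE_PER_DAY -> whole days, rounded down
days_to_clear() {
    echo $(( $1 / $2 ))
}

backlog=1000000000          # assumption: "over a billion" taken as 1e9
gc_per_hour=55000           # gc rate seen in the logs
purge_per_day=30000000      # rate of the purge-objects run

gc_per_day=$(objects_per_day "$gc_per_hour")
echo "gc removes $gc_per_day objects/day"
echo "gc alone: $(days_to_clear "$backlog" "$gc_per_day") days"
echo "purge-objects: $(days_to_clear "$backlog" "$purge_per_day") days"
```

That works out to roughly 1.3 million objects a day and a bit over two years for gc alone, against about 33 days for the purge itself, which matches the estimates in the message.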
>
> I've done some digging and it appears like I should be changing these configuration options:
>
> rgw gc max objs (default: 32)
> rgw gc obj min wait (default: 7200)
> rgw gc processor max time (default: 3600)
> rgw gc processor period (default: 3600)
>
> A few questions I have though are:
>
> Should 'rgw gc processor max time' and 'rgw gc processor period' always be set to the same value?
>
> Which would be better, increasing 'rgw gc max objs' to something like 1024, or reducing the 'rgw gc processor' times to something like 60 seconds?
>
> Any other guidance on the best way to adjust these values?
>
> Thanks,
> Bryan
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


