Re: efficient removal of old objects

On Tue, Jan 31, 2012 at 4:33 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Currently rgw logs objects it wants to delete after some period of time,
> and a radosgw-admin command comes back later to process the log.  It
> works, but is currently slow (one sync op at a time).

Intent log generation doesn't come free of charge; it adds some load
on the system.

>
> A better approach would be to mark objects for later removal, and have the
> OSD do it in some more efficient way.  wip-objs-expire has a client side
> (librados) interface for this.

Note that setting expiration on an object is a more lightweight
operation than appending the intent log, as it would be done as a sub
op in the compound operation that created the object.
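
To illustrate the point, here is a toy Python model of a compound
operation that carries the expiration as one more sub op. This is not
librados and not the wip-objs-expire interface; CompoundOp,
write_full, set_expiration and the dict-based store are all
hypothetical names made up for the sketch.

```python
class CompoundOp:
    """Toy model of a compound operation: sub ops applied together to one object."""

    def __init__(self):
        self.sub_ops = []

    def write_full(self, data):
        self.sub_ops.append(lambda obj: obj.update(data=data))
        return self

    def set_expiration(self, expires_at):
        # Hypothetical sub op; not the actual wip-objs-expire interface.
        self.sub_ops.append(lambda obj: obj.update(expires_at=expires_at))
        return self

    def apply(self, store, name):
        obj = dict(store.get(name, {}))  # work on a copy of the object
        for op in self.sub_ops:
            op(obj)
        store[name] = obj  # publish all changes in a single step


store = {}
CompoundOp().write_full(b"payload").set_expiration(1328000000).apply(store, "obj1")
```

The object is created and marked in the same operation, so there is no
separate append to a log object as with the intent log.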

>
> I think there are a couple questions:
>
> Should this be generalized to saying "do these osd ops at time X" instead
> of "delete at time X"?  Then it could setxattr, remove, call into a class,
> whatever.

While I think it'd make a nice feature, I also think that the problem
space of garbage collection is a bit different, and given the time
constraints it wouldn't make sense to implement this right now anyway.
>
> How would the OSD implement this?  A kludgey way would be to do it during
> scrub.  The current scrub implementation may make that problematic because
> it does a whole PG at a time, and we probably don't want to issue a whole
> PG's worth of deletes at a time.  Is there a way to make that less
> painful?

If we need to lock the entire PG while removing the objects, it
wouldn't work. I'm not too familiar with the scrub code, and I don't
want to dive into possible implementation details here, but getting
the scrub to generate a list of objects for removal may be possible.
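
A rough Python sketch of that split, assuming the scan only records
candidates and the removals are issued afterwards in small batches.
Nothing here reflects the real scrub code; the dict of object name to
expiry time and both function names are assumptions for illustration.

```python
from itertools import islice


def collect_expired(objects, now):
    """Scan pass: only record the names of objects whose expiry has passed."""
    return [name for name, expires_at in objects.items()
            if expires_at is not None and expires_at <= now]


def remove_in_batches(objects, candidates, batch_size, now):
    """Delete candidates a few at a time; returns the number of batches issued."""
    it = iter(candidates)
    batches = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        for name in batch:
            # Re-check at delete time: the object may have changed since the scan.
            expires_at = objects.get(name)
            if expires_at is not None and expires_at <= now:
                del objects[name]
        batches += 1
    return batches


objects = {"a": 10, "b": None, "c": 5, "d": 100, "e": 1}
candidates = collect_expired(objects, now=50)
batches = remove_in_batches(objects, candidates, batch_size=2, now=50)
```

Because the list is built first and consumed in bounded batches, no
single step has to cover a whole PG's worth of deletes.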

>
> Not using scrub means we need some sort of index to keep track of objects
> with delayed events.  Using a collection for this might work, but loading
> all this state into memory would be slow if there were too many events
> registered.
>
> Given all that, and that we need a solution to the expiration soon
> (weeks), do we
>  - do a complete solution now,
>  - parallelize radosgw-admin log processing,
>  - or hack it into scrub?
>
I don't expect to see many hands going up for "hacking" anything. I
would argue that having a garbage-collection-related job running
inside a maintenance activity is not that far-fetched. Not at any
cost, though.
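
For comparison, the second option in the list (parallelizing the log
processing instead of issuing one sync op at a time) could look
roughly like the Python sketch below. It says nothing about how
radosgw-admin is structured; process_intent_log and the remove_object
callback are hypothetical stand-ins for whatever issues the actual
removal.

```python
from concurrent.futures import ThreadPoolExecutor


def process_intent_log(entries, remove_object, max_workers=8):
    """Issue removals for all log entries concurrently; return how many succeeded."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the input order and waits for every removal to finish.
        results = list(pool.map(remove_object, entries))
    return sum(1 for ok in results if ok)


# Stand-in for the object store: removal succeeds only if the object exists.
store = {"a": 1, "b": 2, "c": 3}
removed = process_intent_log(
    ["a", "b", "x"],
    lambda name: store.pop(name, None) is not None,
)
```

This only addresses the processing side; generating the intent log
still adds load.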

Yehuda