Re: full cluster/pool handling

On Thu, Sep 24, 2015 at 5:30 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> Xuan Liu recently pointed out that there is a problem with our handling
> for full clusters/pools: we don't allow any writes when full,
> including delete operations.
>
> While fixing a separate full issue I ended up making several fixes and
> cleanups in the full handling code in
>
>         https://github.com/ceph/ceph/pull/6052
>
> The interesting part is that we will allow a write as long as it
> doesn't increase the overall utilization in bytes or objects (according
> to the pg stats we're maintaining).  That will include remove ops, of
> course, but will also allow overwrites while full, which seems fair.
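
For concreteness, that admission rule boils down to something like the
check below (a minimal sketch with made-up names, not the actual OSD
code):

    #include <cstdint>

    // Net effect of an op on pool usage, derived from the pg stats
    // we already maintain (field names are hypothetical).
    struct OpDelta {
      int64_t bytes;    // net change in stored bytes
      int64_t objects;  // net change in object count
    };

    // While the full flag is set, admit only ops that do not grow
    // usage: removes and pure overwrites pass, creates and appends
    // do not.
    bool admit_while_full(const OpDelta& d) {
      return d.bytes <= 0 && d.objects <= 0;
    }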
>
> However, that's not quite the full story: the client side currently
> does not send any requests while the full flag is set--it waits until
> the flag is cleared before resending things.
>
> We can modify things on the client so that it allows ops it knows will
> succeed (e.g., a simple remove op).  However, if there is another op also
> queued on that object *before* it, we should either block the remove op
> (to preserve ordering) or discard it when the remove succeeds (on the
> assumption that any effect it had is now moot).
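
To make those two options concrete, here is a rough model (hypothetical
structures, nothing like the real Objecter internals):

    #include <deque>

    // Hypothetical per-object queue of ops held back by the full flag.
    struct PendingOp {
      bool is_remove;  // an op we know can succeed while full
    };

    // Option 1 (block): the remove may only jump the full flag if no
    // earlier op on the same object is queued ahead of it.
    bool may_send_remove(const std::deque<PendingOp>& q) {
      return !q.empty() && q.front().is_remove;
    }

    // Option 2 (discard): once the remove succeeds, drop everything
    // queued before it on the assumption its effects are now moot.
    void on_remove_success(std::deque<PendingOp>& q) {
      while (!q.empty() && !q.front().is_remove)
        q.pop_front();
      if (!q.empty())
        q.pop_front();  // the remove itself
    }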

What if it were a compound operation that truncates and writes?

>
> Is the latter option safe?
>
> Or, should we do something more clever?  Ideally other allowed
> operations would be let through too, but unfortunately the client
> doesn't really know enough to tell whether an op will/can succeed.
> E.g., a class "refcount.put" call might result in a deletion (and in
> fact there is a class that does just that).  We could also send all
> such requests and, if

rgw (tail) object removals use this objclass.

> we get ENOSPC, keep them queued and retry when the full flag is cleared.
> That would require a bit more complexity on the OSD side to preserve
> ordering, but it's doable...
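
A client-side sketch of that send-and-retry idea (hypothetical names;
the ordering work Sage mentions would mostly live on the OSD side):

    #include <cerrno>
    #include <deque>
    #include <functional>
    #include <utility>

    // An op whose send() returns 0 on success or -ENOSPC when the
    // OSD rejects it as full (illustrative stand-in, not a real API).
    struct Op { std::function<int()> send; };

    std::deque<Op> enospc_queue;

    void submit(Op op) {
      // Preserve ordering: if earlier ops are already held back,
      // queue behind them rather than racing ahead.
      if (!enospc_queue.empty() || op.send() == -ENOSPC)
        enospc_queue.push_back(std::move(op));
    }

    void on_full_flag_cleared() {
      // Replay in the original order; stop if we fill up again.
      while (!enospc_queue.empty()) {
        if (enospc_queue.front().send() == -ENOSPC)
          break;
        enospc_queue.pop_front();
      }
    }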
>
> sage