Xuan Liu recently pointed out that there is a problem with our handling of full clusters/pools: we don't allow any writes when full, including delete operations.

While fixing a separate full issue I ended up making several fixes and cleanups in the full handling code in https://github.com/ceph/ceph/pull/6052. The interesting part of that is that we will allow a write as long as it doesn't increase the overall utilization in bytes or objects (according to the pg stats we're maintaining). That includes remove ops, of course, but will also allow overwrites while full, which seems fair.

However, that's not quite the full story: the client side currently does not send any requests while the full flag is set--it waits until the full flags are cleared before resending things.

We can modify things on the client so that it allows ops it knows will succeed (e.g., a simple remove op; a rough sketch of that gating is below). However, if there is another op already queued on that object *before* it, we should either block the remove op (to preserve ordering) or discard the earlier op when the remove succeeds (on the assumption that any effect it had is now moot). Is the latter option safe? Or should we do something more clever?

Ideally other allowed operations would be let through as well, but unfortunately the client doesn't really know enough to tell whether they will/can succeed. E.g., a class "refcount.put" call might result in a deletion (and in fact there is a class that does just that).

We could also send all such requests and, if we get ENOSPC, keep them queued and retry when the full flag is cleared. That would require a bit more complexity on the OSD side to preserve ordering, but it's doable...

sage
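
To make the ordering question concrete, here is a rough sketch of the sort of client-side gating I have in mind. The types and names here are hypothetical, not the actual Objecter code: while full, only reads and plain removes with nothing queued ahead of them on the same object are sent; everything else waits for the flag to clear.

// Hypothetical sketch, not the real Objecter API: decide what to do with
// an op submitted while the cluster/pool full flag is set.
#include <deque>
#include <map>
#include <string>

enum class OpType { Read, Write, Delete };

struct Op {
  std::string oid;  // target object
  OpType type;
};

enum class FullAction {
  Send,   // safe to send even though we're full
  Queue,  // hold until the full flag clears
};

struct FullGate {
  // ops we are already holding while full, in submission order, per object
  std::map<std::string, std::deque<Op>> queued;

  FullAction submit(const Op& op, bool full) {
    if (!full)
      return FullAction::Send;
    if (op.type == OpType::Read)
      return FullAction::Send;  // reads never consume space
    if (op.type == OpType::Delete && queued[op.oid].empty())
      return FullAction::Send;  // can't increase usage, nothing ahead of it
    // Overwrites, class calls like refcount.put, or removes with earlier
    // queued ops on the same object: wait for the full flag to clear.
    queued[op.oid].push_back(op);
    return FullAction::Queue;
  }
};

This is the conservative "block the remove" option; the alternative discussed above would be to send the remove anyway and discard the earlier queued ops on that object once it succeeds.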