On Thu, Sep 24, 2015 at 8:04 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Thu, 24 Sep 2015, Robert LeBlanc wrote:
>> On Thu, Sep 24, 2015 at 6:30 AM, Sage Weil wrote:
>> > Xuan Liu recently pointed out that there is a problem with our handling
>> > for full clusters/pools: we don't allow any writes when full,
>> > including delete operations.
>> >
>> > While fixing a separate full issue I ended up making several fixes and
>> > cleanups in the full handling code in
>> >
>> > https://github.com/ceph/ceph/pull/6052
>> >
>> > The interesting part of that is that we will allow a write as long as it
>> > doesn't increase the overall utilization of bytes or objects (according to
>> > the pg stats we're maintaining). That will include remove ops, of course,
>> > but will also allow overwrites while full, which seems fair.
>>
>> What about overwrites on a COW FS, won't that still increase used
>> space? Maybe if it is a COW FS, don't allow overwrites?
>
> Yeah, we could strengthen (optionally?) the check so that only operations
> that result in a net decrease are allowed.

It's not just COW filesystems: anything that modifies
leveldb/rocksdb/whatever in any way will also increase the space used.
That includes regular object deletes, which additionally get added to
the PG log, although *hopefully* that's not a problem since we have our
extra buffers to handle this sort of thing.

While right now we might have some hope of being able to tag ops as
"net deletes" or "net adds", I don't see that happening once we have
widespread third-party object classes, or once that Lua work gets in,
or something similar.

So, I'd be really leery of trying to do anything more advanced than
letting clients execute delete operations, and letting privileged
clients keep doing real work. (Or maybe restricting it entirely to the
second half there, i.e. privileged clients only.)
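To make the conservative policy above concrete, here is a rough sketch in
Python. It is illustration only: Op, OpKind, and admit_when_full are
made-up names for this example, not anything in the Ceph tree, and the
"privileged" field is just a stand-in for a client that is allowed to
ignore the full flag.

```python
from dataclasses import dataclass
from enum import Enum


class OpKind(Enum):
    WRITE = "write"            # create, append, or overwrite
    DELETE = "delete"          # plain object removal
    CLASS_CALL = "class_call"  # object-class method; net space effect unknown


@dataclass(frozen=True)
class Op:
    kind: OpKind
    # Hypothetical flag: True for a client trusted to keep working when full.
    privileged: bool = False


def admit_when_full(op: Op, cluster_full: bool) -> bool:
    """Return True if the op may proceed under the conservative policy."""
    if not cluster_full:
        return True  # nothing to enforce while there is free space
    if op.privileged:
        return True  # privileged clients keep doing real work
    # Everyone else may only delete. Overwrites and class calls are refused
    # because their net effect on used space cannot be predicted reliably.
    return op.kind is OpKind.DELETE
```

Note that this deliberately does not try to classify ops as "net adds" or
"net deletes"; it only looks at the op type and the client's privilege.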

The switch for privileged clients already exists, by the way, although I
don't think it's actually enforced via cephx caps (it should be): the
Objecter has an honor_full_flag setting, which the MDS sets to false. I
don't think the library interfaces are there to specify it per-op, but
IIRC it is part of the data sent to OSDs, so it wouldn't require a wire
protocol change.
-Greg