Hi Cephers,
A while ago Ceph got PR#15199
(https://github.com/ceph/ceph/pull/15199), which improved object space
usage tracking.
It introduced a 'dirty' extent map (in fact, an interval set) in an
attempt to track the real space used by an object, taking overwrites,
holes, etc. into account.
This extent map is kept at the OSD level within the object_info_t
structure, which in turn uses object attributes as persistent storage.
This is a step forward compared to the initial implementation, which
relied on the maximum object offset.
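To illustrate the difference between the two estimates, here is a rough
Python sketch. It is not the Ceph code: the function names are made up
for illustration, and extents are modelled as a plain sorted list of
(offset, length) pairs rather than the real interval set.

```python
def merge_extent(extents, offset, length):
    """Add [offset, offset+length) to a list of disjoint (offset, length)
    extents, merging anything it overlaps or touches."""
    start, end = offset, offset + length
    merged = []
    for o, l in extents:
        if o + l < start or o > end:
            # No overlap and not adjacent: keep the extent unchanged.
            merged.append((o, l))
        else:
            # Overlapping or adjacent: grow the new extent to cover it.
            start = min(start, o)
            end = max(end, o + l)
    merged.append((start, end - start))
    merged.sort()
    return merged


def used_by_max_offset(extents):
    """Original-style estimate: highest written offset."""
    return max((o + l for o, l in extents), default=0)


def used_by_extents(extents):
    """Interval-set estimate: sum of the written ranges."""
    return sum(l for _, l in extents)


extents = []
# Two writes 1 MB apart, then a partial overwrite of the first one.
for off, ln in [(0, 4096), (1 << 20, 4096), (2048, 4096)]:
    extents = merge_extent(extents, off, ln)

print(extents)
print(used_by_max_offset(extents), used_by_extents(extents))
```

For a sparse object like this one the max-offset estimate counts the
hole between the two writes, while the interval set counts only the
ranges that were actually written.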
However, IMO there are a couple of significant issues with this patch:
1) Extent tracking might be pretty expensive. Each write operation
triggers both an encoding of the extent map and an update of it at the
object store level. It takes 16 bytes (offset+length) per extent, which
easily results in submitting several KB of additional data per write for
scattered objects. For BlueStore this additional data weighs on the
already overburdened KV store. Moreover, this tracking mechanism fully
duplicates existing functionality, at least for BlueStore, which tracks
a 'dirty' extent map on its own.
2) The approach is still inaccurate. It tracks logical space, not the
actual physical allocation. E.g. a single-byte write might result in a
4K physical space allocation.
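Both points can be put into numbers with a small Python sketch. This is
only an illustration under stated assumptions: the 16 bytes per extent
and the 4K allocation figure are taken from the text above, the helper
names are hypothetical, and the allocation model (whole fixed-size units,
no compression or blob sharing) is a deliberate simplification of what a
real object store does.

```python
import struct

EXTENT_FMT = "<QQ"   # assumed layout: 8-byte offset + 8-byte length
ALLOC_UNIT = 4096    # assumed allocation unit (the 4K from the example)


def encode_extents(extents):
    """Serialize every extent; this is the blob rewritten on each write."""
    return b"".join(struct.pack(EXTENT_FMT, o, l) for o, l in extents)


def logical_bytes(extents):
    """Space as seen by logical extent tracking."""
    return sum(l for _, l in extents)


def physical_bytes(extents, alloc_unit=ALLOC_UNIT):
    """Space if every touched allocation unit is allocated in full."""
    units = set()
    for o, l in extents:
        if l == 0:
            continue
        first = o // alloc_unit
        last = (o + l - 1) // alloc_unit
        units.update(range(first, last + 1))
    return len(units) * alloc_unit


# A scattered object: 256 single-byte writes, 64K apart.
scattered = [(i * 65536, 1) for i in range(256)]

print(len(encode_extents(scattered)))   # extent map size per write
print(logical_bytes(scattered))         # what extent tracking reports
print(physical_bytes(scattered))        # what is allocated in this model
```

With these assumptions the extent map alone is 4 KB per write, while the
reported logical usage (256 bytes) is far below the 1 MB allocated.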
My suggestion is to refactor this mechanism and make the Object Store
track and report back real object space usage. This looks easily doable
and more accurate for BlueStore. We can then either roll back to the
original implementation or migrate the new PR#15199 approach to FileStore.
What do you think?
Thanks,
Igor