You bet, glad to help.

Zillions of small files do carry proportionally higher metadata overhead, and they can be problematic in several ways. When using RGW, indexless buckets may be advantageous.

Another phenomenon is space amplification. With, say, a 1 GB file/object, a partially full last allocated block is a trivial amount of wasted space (sometimes called internal fragmentation). As the files get smaller, that wasted space becomes an increasingly large fraction of the total consumed. Mark's sheet is terrific for visualizing this:

https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit?usp=sharing

Work was done a couple of releases ago to allow lowering the default min_alloc_size because of this inefficiency, especially with small RGW objects.

A subtle additional factor that is often missed is that RADOS writes full stripes. That adds another layer of potential wasted space, and misaligned or larger EC profiles can make it worse than it would be with replication.

> On Feb 25, 2022, at 4:18 AM, Bobby <italienisch1987@xxxxxxxxx> wrote:
>
> thanks Anthony and Janne....exactly what I have been looking for!
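
P.S. To make the space amplification point concrete, here is a rough back-of-the-envelope sketch in Python. It is a deliberately simplified model, not how BlueStore or RADOS actually account for space: it assumes each replica, or each EC chunk, is simply rounded up to a whole number of allocation units, and it ignores metadata, compression, and stripe_unit details. The function names and the 64 KiB / 4 KiB allocation units are illustrative assumptions on my part.

```python
def allocated_bytes(size: int, alloc_unit: int) -> int:
    """Round `size` up to a whole number of allocation units (assumed model)."""
    return -(-size // alloc_unit) * alloc_unit  # integer ceiling division


def replicated_raw(size: int, alloc_unit: int, replicas: int = 3) -> int:
    """Raw bytes used when every replica is rounded up independently."""
    return replicas * allocated_bytes(size, alloc_unit)


def ec_raw(size: int, alloc_unit: int, k: int = 4, m: int = 2) -> int:
    """Raw bytes used when the object is split into k data chunks plus
    m coding chunks, each rounded up independently (assumed model)."""
    chunk = -(-size // k)  # bytes per data chunk, rounded up
    return (k + m) * allocated_bytes(chunk, alloc_unit)


def amplification(raw: int, size: int) -> float:
    """Raw bytes consumed per byte of user data."""
    return raw / size


if __name__ == "__main__":
    KiB = 1024
    # Illustrative object sizes: 1 KiB, 4 KiB, 64 KiB, 1 MiB, 1 GiB
    for size in (1 * KiB, 4 * KiB, 64 * KiB, 1024 * KiB, 1024 * 1024 * KiB):
        for alloc in (64 * KiB, 4 * KiB):
            rep = amplification(replicated_raw(size, alloc), size)
            ec = amplification(ec_raw(size, alloc), size)
            print(f"object={size:>12} B  alloc={alloc:>6} B  "
                  f"3x replication={rep:8.2f}x  EC 4+2={ec:8.2f}x")
```

For large objects both columns settle at the nominal overhead (3x for three replicas, 1.5x for 4+2), while for small objects the rounding dominates, and under this model EC can come out worse than replication because every one of the k+m chunks pays the rounding cost.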