in the rgw refactoring meetings, we've been discussing ways to improve space utilization for workloads of mixed object sizes. i think it's worth bringing this up in Mark's performance call as well, to explore other options from the osd/librados perspective.

most of our discussion so far has centered around using s3's storage classes (which rgw maps to different rados pools) to direct object uploads to an appropriately-configured pool depending on the object's size. for example, all objects under 1M would be assigned to a SMALL storage class, while the rest go to LARGE. doing this directly is tricky, because http requests don't always tell us the full object size up front. this strategy could also lead to confusion in s3 applications, because the storage class is a visible part of the protocol and clients expect to have control over it.

you can read more about storage classes and rgw pool placement in https://docs.ceph.com/en/latest/radosgw/placement/. essentially, each bucket chooses a 'placement target' on creation, and that placement target defines which storage classes are available for its object uploads. each storage class defines the rados pool to use for the object data. each placement target has a default storage class called STANDARD, which is used for object uploads that don't specify a storage class. this STANDARD pool is also used to store all of the bucket's head objects, regardless of their storage class. objects uploaded to the STANDARD storage class store up to 4MB of data in the head object, and the rest in tail objects of the same pool. objects uploaded to other storage classes only store metadata in the head object; all of their data goes in tail objects in their own pool.

in today's call, Yehuda made the observation that for this use case, it would be ideal to put all head objects in a pool with small min_alloc_size and all tails in larger-sized pools.
this way, even though we don't necessarily know the full object size up front, we'd still place all small objects in the correctly-sized pool, with larger objects spilling over into their own tail pools.

this doesn't quite match up with our existing implementation though, because we put the STANDARD storage class' tail objects in the same pool as the head objects, and other storage classes only store data in the tails.

so i suggested an additional option to specify a 'head object pool' in the placement target that's independent of its storage classes. when specified, all head objects would be written to that pool instead, along with a configurable amount of data. the benefits of this strategy are that it preserves the storage class behavior that clients expect, and that it enables an optional configuration for a space-optimized head object pool.
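
to make the difference concrete, here's a rough python sketch of where an object's bytes would land under the existing layout versus the proposed 'head object pool' option. this is only a toy model, not rgw code: the names PlacementTarget, head_pool, head_data_max and layout() are all hypothetical, as are the pool names in the usage below. it also assumes the configurable amount of head data applies to every storage class, which the proposal doesn't spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

MiB = 1024 * 1024


@dataclass
class PlacementTarget:
    # storage class name -> rados data pool (must include "STANDARD")
    storage_class_pools: Dict[str, str]
    # hypothetical option: a dedicated pool for all head objects
    head_pool: Optional[str] = None
    # hypothetical option: max data bytes kept in the head when head_pool is set
    head_data_max: int = 4 * MiB


def layout(target: PlacementTarget, size: int,
           storage_class: str = "STANDARD") -> List[Tuple[str, str, int]]:
    """Return (part, pool, data_bytes) entries for an object of `size` bytes."""
    standard_pool = target.storage_class_pools["STANDARD"]
    tail_pool = target.storage_class_pools[storage_class]

    if target.head_pool is not None:
        # proposed: every head goes to the head pool with up to head_data_max
        # bytes of data (assumed here to apply to all storage classes)
        head_pool = target.head_pool
        head_bytes = min(size, target.head_data_max)
    else:
        # existing: heads live in the STANDARD pool; only STANDARD uploads
        # keep data (up to 4MB) in the head, other classes keep metadata only
        head_pool = standard_pool
        head_bytes = min(size, 4 * MiB) if storage_class == "STANDARD" else 0

    parts = [("head", head_pool, head_bytes)]
    if size > head_bytes:
        # whatever doesn't fit in the head spills into the class's own pool
        parts.append(("tail", tail_pool, size - head_bytes))
    return parts


# usage: pool names are made up for illustration
pools = {"STANDARD": "std.data", "COLD": "cold.data"}
current = PlacementTarget(pools)
proposed = PlacementTarget(pools, head_pool="head.small", head_data_max=64 * 1024)

print(layout(current, 1 * MiB, "COLD"))   # head holds no data today
print(layout(proposed, 16 * 1024))        # small object fits in the head pool
print(layout(proposed, 1 * MiB, "COLD"))  # head pool first, then COLD tails
```

in this model the client-visible storage class still decides where the tails go, and only the head placement changes when the optional head pool is configured.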