Thanks Sage for the quick response.

This is on Firefly v0.80.4.

When putting objects with *rados* directly, the xattrs stay inline. The problem shows up when using radosgw, since we have a bunch of metadata to keep via xattrs, including:

rgw.idtag    : 15 bytes
rgw.manifest : 381 bytes
rgw.acl      : 121 bytes
rgw.etag     : 33 bytes

Given the background, it looks like the problem is that rgw.manifest is too large, so XFS moves it out to extents. If I understand correctly, if we port the change to Firefly, we should be able to keep the xattrs inline in the inode, since the accumulated size is still less than 2K (please correct me if I am wrong here).

Thanks,
Guang

----------------------------------------
> Date: Tue, 16 Jun 2015 12:43:08 -0700
> From: sage@xxxxxxxxxxxx
> To: yguang11@xxxxxxxxxxx
> CC: ceph-devel@xxxxxxxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: xattrs vs. omap with radosgw
>
> On Tue, 16 Jun 2015, GuangYang wrote:
>> Hi Cephers,
>> While looking at disk utilization on OSD, I noticed the disk was constantly busy with large number of small writes, further investigation showed that, as radosgw uses xattrs to store metadata (e.g. etag, content-type, etc.), which made the xattrs get from local to extents, which incurred extra I/O.
>>
>> I would like to check if anybody has experience with offloading the metadata to omap:
>> 1> Offload everything to omap? If this is the case, should we make the inode size as 512 (instead of 2k)?
>> 2> Partial offload the metadata to omap, e.g. only offloading the rgw specified metadata to omap.
>>
>> Any sharing is deeply appreciated. Thanks!
>
> Hi Guang,
>
> Is this hammer or firefly?
>
> With hammer the size of object_info_t crossed the 255 byte boundary, which
> is the max xattr value that XFS can inline. We've since merged something
> that stripes over several small xattrs so that we can keep things inline,
> but it hasn't been backported to hammer yet. See
> c6cdb4081e366f471b372102905a1192910ab2da.
> Perhaps this is what you're seeing?
>
> I think we're still better off with larger XFS inodes and inline xattrs if
> it means we avoid leveldb at all for most objects.
>
> sage
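
P.S. To make the size arithmetic in this thread concrete, here is a rough Python sketch. It uses the four rgw xattr sizes listed in my reply, the 255-byte inline value limit Sage mentions, and the 2K inode size. The helper names (`oversized`, `stripe`, `unstripe`) and the `name@N` chunk naming are hypothetical, made up for illustration; this is not the code from c6cdb4081e366f471b372102905a1192910ab2da, and it ignores attribute names, inode overhead, and the other xattrs the OSD stores.

```python
# Sizes (in bytes) of the rgw xattr values reported in this thread.
RGW_XATTRS = {
    "rgw.idtag": 15,
    "rgw.manifest": 381,
    "rgw.acl": 121,
    "rgw.etag": 33,
}

XFS_INLINE_VALUE_MAX = 255  # max xattr value XFS can inline, per Sage's reply
INODE_SIZE = 2048           # the 2K inode size discussed in this thread


def oversized(xattrs, limit=XFS_INLINE_VALUE_MAX):
    """Return the names of xattrs whose value size exceeds the inline limit."""
    return sorted(name for name, size in xattrs.items() if size > limit)


def stripe(name, value, limit=XFS_INLINE_VALUE_MAX):
    """Split one value into chunks of at most `limit` bytes.

    The first chunk keeps the original name; later chunks get a
    hypothetical "name@N" suffix (illustrative naming only).
    """
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)] or [b""]
    return {
        (name if i == 0 else f"{name}@{i}"): chunk
        for i, chunk in enumerate(chunks)
    }


def unstripe(name, stored):
    """Reassemble a value written by stripe() by reading chunks in order."""
    parts = []
    i = 0
    while True:
        key = name if i == 0 else f"{name}@{i}"
        if key not in stored:
            break
        parts.append(stored[key])
        i += 1
    return b"".join(parts)


if __name__ == "__main__":
    total = sum(RGW_XATTRS.values())
    print("total value bytes:", total, "of", INODE_SIZE)
    print("over the inline limit:", oversized(RGW_XATTRS))
    chunks = stripe("rgw.manifest", b"m" * RGW_XATTRS["rgw.manifest"])
    print("manifest chunk sizes:", [len(v) for v in chunks.values()])
```

With these numbers the four values add up to 550 bytes, and only rgw.manifest is over the 255-byte limit. Split at that limit, it becomes two chunks of 255 and 126 bytes, which is why I would expect striping to help here, assuming everything else still fits in the inode.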