Re: xattrs vs. omap with radosgw

After back-porting Sage's patch to Giant, the xattrs written through radosgw can be stored inline. I haven't run extensive tests yet; I will update once I have some performance data to share.
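
For anyone following along, here is a rough Python sketch of the striping idea discussed in this thread: a value larger than the inline limit is split across several small xattrs, each short enough to stay inline. This is not the actual patch. The 254-byte chunk size is taken from Mark's remark quoted below, the sizes are the radosgw ones I listed earlier, and the `name@N` key naming and the `stripe`/`unstripe` helpers are assumptions made up for illustration.

```python
# Sizes (in bytes) of the radosgw xattrs as reported in this thread.
RGW_XATTRS = {
    "rgw.idtag": 15,
    "rgw.manifest": 381,
    "rgw.acl": 121,
    "rgw.etag": 33,
}

# Assumption: per-value inline limit, from the "254 byte or smaller
# chunks" remark in the thread.
MAX_INLINE_VALUE = 254


def stripe(name, value, chunk_size=MAX_INLINE_VALUE):
    """Split one xattr value into chunks of at most chunk_size bytes.

    The first chunk keeps the original name; later chunks use a
    hypothetical "name@N" key (illustrative naming only).
    """
    chunks = [value[i:i + chunk_size]
              for i in range(0, len(value), chunk_size)] or [b""]
    return {name if i == 0 else f"{name}@{i}": chunk
            for i, chunk in enumerate(chunks)}


def unstripe(name, stored):
    """Reassemble a value from its chunks, reading keys in order."""
    parts = []
    i = 0
    while True:
        key = name if i == 0 else f"{name}@{i}"
        if key not in stored:
            break
        parts.append(stored[key])
        i += 1
    return b"".join(parts)


if __name__ == "__main__":
    for name, size in RGW_XATTRS.items():
        striped = stripe(name, bytes(size))
        print(name, [len(chunk) for chunk in striped.values()])
    print("total bytes:", sum(RGW_XATTRS.values()))
```

With the sizes above, only rgw.manifest needs splitting (254 + 127 bytes), and the total of 550 bytes stays well under a 2K inode.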

Thanks,
Guang

> Date: Tue, 16 Jun 2015 15:51:44 -0500
> From: mnelson@xxxxxxxxxx
> To: yguang11@xxxxxxxxxxx; sage@xxxxxxxxxxxx
> CC: ceph-devel@xxxxxxxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: xattrs vs. omap with radosgw
>
>
>
> On 06/16/2015 03:48 PM, GuangYang wrote:
> > Thanks Sage for the quick response.
> >
> > It is on Firefly v0.80.4.
> >
> > When putting objects with *rados* directly, the xattrs stay inline. The problem shows up with radosgw, since it keeps a bunch of metadata in xattrs, including:
> > rgw.idtag : 15 bytes
> > rgw.manifest : 381 bytes
>
> Ah, that manifest will push us over the limit afaik resulting in every
> inode getting a new extent.
>
> > rgw.acl : 121 bytes
> > rgw.etag : 33 bytes
> >
> > Given this background, it looks like the problem is that rgw.manifest is too large, so XFS moves it out to extents. If I understand correctly, if we port the change to Firefly, the xattrs should fit inline in the inode, since the accumulated size (15 + 381 + 121 + 33 = 550 bytes) is still less than 2K (please correct me if I am wrong here).
>
> I think you are correct so long as the patch breaks that manifest down
> into 254 byte or smaller chunks.
>
> >
> > Thanks,
> > Guang
> >
> >
> > ----------------------------------------
> >> Date: Tue, 16 Jun 2015 12:43:08 -0700
> >> From: sage@xxxxxxxxxxxx
> >> To: yguang11@xxxxxxxxxxx
> >> CC: ceph-devel@xxxxxxxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
> >> Subject: Re: xattrs vs. omap with radosgw
> >>
> >> On Tue, 16 Jun 2015, GuangYang wrote:
> >>> Hi Cephers,
> >>> While looking at disk utilization on an OSD, I noticed the disk was constantly busy with a large number of small writes. Further investigation showed that, because radosgw uses xattrs to store metadata (e.g. etag, content-type, etc.), the xattrs moved from inline storage to extents, which incurred extra I/O.
> >>>
> >>> I would like to check if anybody has experience with offloading the metadata to omap:
> >>> 1> Offload everything to omap? If so, should we set the inode size to 512 (instead of 2k)?
> >>> 2> Partially offload the metadata to omap, e.g. only the rgw-specific metadata.
> >>>
> >>> Any sharing is deeply appreciated. Thanks!
> >>
> >> Hi Guang,
> >>
> >> Is this hammer or firefly?
> >>
> >> With hammer the size of object_info_t crossed the 255 byte boundary, which
> >> is the max xattr value that XFS can inline. We've since merged something
> >> that stripes over several small xattrs so that we can keep things inline,
> >> but it hasn't been backported to hammer yet. See
> >> c6cdb4081e366f471b372102905a1192910ab2da. Perhaps this is what you're
> >> seeing?
> >>
> >> I think we're still better off with larger XFS inodes and inline xattrs if
> >> it means we avoid leveldb at all for most objects.
> >>
> >> sage
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
