Re: [LSF/MM/BFP TOPIC] Composefs vs erofs+overlay

Alexander Larsson <alexl@xxxxxxxxxx> · Tue, 7 Mar 2023 10:56:28 +0100

On Tue, Mar 7, 2023 at 10:38 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
> On 2023/3/7 17:26, Gao Xiang wrote:
> >
> >
> > On 2023/3/7 17:07, Alexander Larsson wrote:
> >> On Tue, Mar 7, 2023 at 9:34 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>>
> >>>
> >>> On 2023/3/7 16:21, Alexander Larsson wrote:
> >>>> On Mon, Mar 6, 2023 at 5:17 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>>>> I tested the performance of "ls -lR" on the whole tree of
> >>>>>>> cs9-developer-rootfs.  It seems that the performance of erofs (generated
> >>>>>>> from mkfs.erofs) is slightly better than that of composefs.  While the
> >>>>>>> performance of erofs generated from mkfs.composefs is slightly worse
> >>>>>>> than that of composefs.
> >>>>>>
> >>>>>> I suspect that the reason for the lower performance of mkfs.composefs
> >>>>>> is the added overlay.fs-verity xattr to all the files. It makes the
> >>>>>> image larger, and that means more i/o.
> >>>>>
> >>>>> Actually you could move overlay.fs-verity to EROFS shared xattr area (or
> >>>>> even overlay.redirect but it depends) if needed, which could save some
> >>>>> I/Os for your workloads.
> >>>>>
> >>>>> Shared xattrs can be used this way as well if you care about such a
> >>>>> minor difference. Actually, I think inlined xattrs are only meaningful
> >>>>> for selinux labels and capabilities in your workload.
> >>>>
> >>>> Really? Could you expand on this, because I would think it will be
> >>>> sort of the opposite. In my usecase, the erofs fs will be read by
> >>>> overlayfs, which will probably access overlay.* pretty often.  At the
> >>>> very least it will load overlay.metacopy and overlay.redirect for
> >>>> every lookup.
> >>>
> >>> Really.  Arranged that way, it will behave much like the current
> >>> composefs on-disk layout (the composefs vdata area).
> >>>
> >>> That way an extra I/O is needed for verification, but it only happens
> >>> when actually opening the file (so "ls -lR" is not impacted), and the
> >>> on-disk inodes are more compact.
> >>>
> >>> All EROFS xattrs will be cached in memory so that accessing
> >>> overlay.* pretty often is not greatly impacted due to no real I/Os
> >>> (IOWs, only some CPU time is consumed).
> >>
> >> So, I tried moving the overlay.digest xattr to the shared area, but
> >> actually this made the performance worse for the ls case. I have not
> >
> > That is quite strange.  We'd like to open it up if needed.  BTW, did you
> > test EROFS with acl enabled all the time?
> >
> >> looked into the cause in detail, but my guess is that ls looks for the
> >> acl xattr, and such a negative lookup will cause erofs to look at all
> >> the shared xattrs for the inode, which means they all end up being
> >> loaded anyway. Of course, this will only affect ls (or other cases
> >> that read the acl), so it's perhaps a bit uncommon.
> >
> > Yeah, in addition to that, I guess real acls could be placed in inlined
> > xattrs as well if they exist...
> >
> >>
> >> Did you ever consider putting a bloom filter in the h_reserved area of
> >> erofs_xattr_ibody_header? Then it could return early without i/o
> >> operations for keys that are not set for the inode. Not sure what the
> >> computational cost of that would be though.
> >
> > Good idea!  Let me think about it, but enabling "noacl" mount
> > option isn't prefered if acl is no needed in your use cases.
>
>            ^ is preferred.

That is probably the right approach for the composefs use case. But
even when you want acls, typically only a few files have acls set, so
it might be interesting to handle the negative acl lookup case more
efficiently.
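
To make the bloom filter idea a bit more concrete, here is a rough
userspace sketch. It assumes a 32-bit filter word stored per inode and
one bit set per xattr name. The function names and the FNV-1a hash are
hypothetical choices for illustration only; this is not erofs code, and
a real implementation may well pick a different hash and layout.

```c
#include <stdint.h>

/* Hypothetical hash over the full xattr name (FNV-1a, 32-bit). */
static uint32_t xattr_name_hash(const char *name)
{
	uint32_t h = 2166136261u;

	while (*name) {
		h ^= (uint8_t)*name++;
		h *= 16777619u;
	}
	return h;
}

/* Map a name to a single bit position in the 32-bit filter. */
static uint32_t xattr_bloom_bit(const char *name)
{
	return 1u << (xattr_name_hash(name) & 31);
}

/* Image build time: OR together the bits of every xattr on the inode. */
static uint32_t xattr_bloom_build(const char *const *names, int count)
{
	uint32_t filter = 0;

	for (int i = 0; i < count; i++)
		filter |= xattr_bloom_bit(names[i]);
	return filter;
}

/*
 * Lookup time: 0 means the name is definitely not set, so the caller
 * can return "no such xattr" without reading any xattr blocks.
 * Non-zero means "maybe", and the normal lookup has to run.
 */
static int xattr_bloom_may_contain(uint32_t filter, const char *name)
{
	return (filter & xattr_bloom_bit(name)) != 0;
}
```

The per-lookup cost is one hash of the name plus a mask test. With only
a handful of xattrs per inode most of the 32 bits stay clear, so a
negative acl lookup would usually be rejected early, although a hash
collision can still send it down the slow path.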

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                Red Hat, Inc
       alexl@xxxxxxxxxx         alexander.larsson@xxxxxxxxx