Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem

Alexander Larsson <alexl@xxxxxxxxxx> · Wed, 25 Jan 2023 11:15:59 +0100

On Wed, 2023-01-25 at 18:05 +0800, Gao Xiang wrote:
> 
> 
> On 2023/1/25 17:37, Alexander Larsson wrote:
> > On Tue, 2023-01-24 at 21:06 +0200, Amir Goldstein wrote:
> > > On Tue, Jan 24, 2023 at 3:13 PM Alexander Larsson
> > > <alexl@xxxxxxxxxx>
> 
> ...
> 
> > > > 
> > > > They are all strictly worse than squashfs in the above testing.
> > > > 
> > > 
> > > It would be interesting to know why, and whether an optimized
> > > mkfs.erofs or mkfs.ext4 would have made any improvement.
> > 
> > Even the non-loopback mounted (direct xfs backed) version performed
> > worse than the squashfs one. I'm sure an erofs with sparse files
> > would do better due to a more compact file, but I don't really see
> > how it would perform significantly differently than the squashfs
> > code. Yes, squashfs lookup is linear in directory length, while
> > erofs is log(n), but the directories are not so huge that this would
> > dominate the runtime.
> > 
> > To get an estimate of this I made a broken version of the erofs
> > image, where the metacopy files are actually 0 byte size rather than
> > sparse. This made the erofs file 18M instead, and gained 10% in the
> > cold cache case. This, while good, is not nearly enough to matter
> > compared to the others.
> > 
> > I don't think the base performance here depends much on the backing
> > filesystem. An ls -lR workload is just a measurement of the actual
> > (i.e. non-dcache) performance of the filesystem implementation of
> > lookup and iterate, and overlayfs just has more work to do here,
> > especially in terms of the amount of i/o needed.
> 
> I will prepare a formal mkfs.erofs version in a day or two, since
> we're celebrating Lunar New Year now.
> 
> Since you don't have more I/O traces for analysis, I have to make
> another wild guess.
> 
> Could you help benchmark your v2 too? I'm not sure if such
> performance also exists in v2.  The reason I guess this is that you
> seem to read all dir inode pages when doing the first lookup, which
> can benefit sequential dir access.
> 
> I'm not sure if EROFS can reach a similar number by forcing
> readahead on dirs to read all dir data at once as well.
> 
> Apart from that I don't see a significant difference; at least
> personally I'd like to know where such a huge difference could come
> from.  I don't think it is all because of the read-only on-disk
> format difference.

I think the performance difference between v2 and v3 would be rather
minor in this case, because I don't think a lot of the directories are
large enough to be split in chunks. I also don't believe erofs and
composefs should fundamentally differ much in performance here, given
that both use a compact binary searchable layout for dirents. However,
the full comparison is "composefs" vs "overlayfs + erofs", and in that
case composefs wins.
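
As a rough illustration of the linear vs log(n) dirent lookup point
discussed above, here is a minimal Python sketch. It is not kernel code
and does not model any real on-disk format; the function names and the
synthetic directory contents are made up for the example. It only counts
name comparisons for a linear scan versus a binary search over sorted
names.

```python
def linear_lookup(names, target):
    """Scan entries in order; return (index or -1, comparisons made)."""
    comparisons = 0
    for i, name in enumerate(names):
        comparisons += 1
        if name == target:
            return i, comparisons
    return -1, comparisons


def binary_lookup(sorted_names, target):
    """Lower-bound binary search over sorted names.

    Returns (index or -1, loop comparisons made).
    """
    lo, hi = 0, len(sorted_names)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_names[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_names) and sorted_names[lo] == target
    return (lo if found else -1), comparisons


# Hypothetical directory with 1000 entries, sorted by name.
names = sorted(f"file{i:04d}" for i in range(1000))

if __name__ == "__main__":
    for lookup in (linear_lookup, binary_lookup):
        index, comparisons = lookup(names, "file0999")
        print(lookup.__name__, index, comparisons)
```

For directories of a few dozen entries the gap in comparison counts is
small, which fits the point that lookup complexity alone should not
dominate the runtime here.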

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc
       alexl@xxxxxxxxxx            alexander.larsson@xxxxxxxxx
He's an obese Catholic messiah who knows the secret of the alien 
invasion. She's a provocative Bolivian single mother living on borrowed
time. They fight crime!