Re: [RFC PATCH 03/24] erofs: add Errno in Rust

Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> · Thu, 26 Sep 2024 19:23:26 +0800

On 2024/9/26 19:01, Ariel Miculas via Linux-erofs wrote:
On 24/09/26 06:46, Gao Xiang wrote:

...

	                Total Size (MiB)	Average layer size (MiB)	Saved / 766.1MiB
Compressed OCI (tar.gz)	282.5	28.3	63%
Uncompressed OCI (tar)	766.1	76.6	0%
Uncompressed EROFS	109.5	11.0	86%
EROFS (DEFLATE,9,32k)	46.4	4.6	94%
EROFS (LZ4HC,12,64k)	54.2	5.4	93%
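
For reference, the "Saved" column appears to be computed against the 766.1 MiB uncompressed OCI total. A minimal sketch of that arithmetic, assuming the formula is saved = 1 - total / 766.1, rounded to a whole percent (the names `BASELINE_MIB`, `totals_mib` and `saved_percent` are made up for illustration):

```python
# Baseline taken from the table header: total size of the uncompressed OCI tar layers.
BASELINE_MIB = 766.1

# Total sizes in MiB, copied from the table above.
totals_mib = {
    "Compressed OCI (tar.gz)": 282.5,
    "Uncompressed OCI (tar)": 766.1,
    "Uncompressed EROFS": 109.5,
    "EROFS (DEFLATE,9,32k)": 46.4,
    "EROFS (LZ4HC,12,64k)": 54.2,
}


def saved_percent(total_mib: float) -> int:
    """Space saved relative to the baseline, as a whole percent (assumed formula)."""
    return round((1 - total_mib / BASELINE_MIB) * 100)


for name, total in totals_mib.items():
    print(f"{name}: {saved_percent(total)}%")
```

Under that assumption the computed values match the percentages listed in the table.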

I don't know which compression algorithm you are using (maybe Zstd?),
but the result is
    EROFS (LZ4HC,12,64k)  54.2
    PuzzleFS compressed   53?
    EROFS (DEFLATE,9,32k) 46.4

I could rerun with EROFS + Zstd, but it should be smaller. This feature
has been supported since Linux 6.1, thanks.

The average layer size is very impressive for EROFS, great work.
However, if we multiply the average layer size by 10, we get the total
size (5.4 MiB * 10 ~ 54.2 MiB), whereas for PuzzleFS, we see that while
the average layer size is 30 MiB (for the compressed case), the unified
size is only 53 MiB. So this tells me there's blob sharing between the
different versions of Ubuntu Jammy with PuzzleFS, but there's no sharing
with EROFS (what I'm talking about is deduplication across the multiple
versions of Ubuntu Jammy and not within one single version).

Don't get me wrong, but I don't think you got the point.

First, what you asked was `I'm referring specifically to this
comment: "EROFS already supports variable-sized chunks + CDC"`,
so I clearly answered with the result of compressed data global
deduplication with CDC.

Here both EROFS and Squashfs compress 10 Ubuntu images into
one image for a fair comparison to show the benefit of CDC, so

It might be a fair comparison, but that's not how container images are
distributed. You're trying to argue that I should just use EROFS and I'm

First, OCI layers are distributed just as I said.

For example, I could introduce some common blobs to keep
chunks as a chunk dictionary.  Then each image
will be just an index, and all data will be
deduplicated.  That is also how Nydus works.
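
To make the chunk dictionary idea concrete, here is a rough sketch, not how EROFS or Nydus actually implement it. It assumes fixed-size chunks instead of CDC for brevity, and an in-memory dict standing in for the shared blob; `CHUNK_SIZE`, `build_index` and `read_image` are hypothetical names:

```python
import hashlib

# Tiny chunk size for illustration only; fixed-size chunking stands in for CDC here.
CHUNK_SIZE = 4


def build_index(data: bytes, dictionary: dict[str, bytes]) -> list[str]:
    """Split data into chunks, add unseen chunks to the shared dictionary,
    and return the per-image index (a list of chunk digests)."""
    index = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        dictionary.setdefault(digest, chunk)  # store each distinct chunk once
        index.append(digest)
    return index


def read_image(index: list[str], dictionary: dict[str, bytes]) -> bytes:
    """Reassemble an image by looking up each indexed chunk in the dictionary."""
    return b"".join(dictionary[d] for d in index)


dictionary: dict[str, bytes] = {}
image_v1 = b"AAAABBBBCCCC"
image_v2 = b"AAAABBBBDDDD"  # shares its first two chunks with image_v1

index_v1 = build_index(image_v1, dictionary)
index_v2 = build_index(image_v2, dictionary)

stored_bytes = sum(len(c) for c in dictionary.values())
print(stored_bytes, len(image_v1) + len(image_v2))
```

Both images are fully recoverable from their indexes, while the shared dictionary only holds the four distinct chunks rather than all six.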

showing you that EROFS doesn't currently support the functionality
provided by PuzzleFS: the deduplication across multiple images.

No, EROFS supports external devices/blobs to keep a lot of
chunks too (as dictionary to share data among images), but
clearly it has the upper limit.

But PuzzleFS just treats each individual chunk as a separate
file, which will cause an unavoidable "open an arbitrary number of
files on reading, even in page fault context".

I believe they are basically equal to your `Unified size`s, so
the result is

			Your unified size
	EROFS (LZ4HC,12,64k)  54.2
	PuzzleFS compressed   53?
	EROFS (DEFLATE,9,32k) 46.4

That is why I used your 53 unified size to show EROFS is much
smaller than PuzzleFS.

The reason why EROFS and SquashFS don't have the `Total Size`s
is just because we cannot store every individual chunk into some
separate file.

Well, storing individual chunks into separate files is the entire point
of PuzzleFS.

Currently, I have seen no reason to open arbitrary kernel files
(maybe hundreds at once due to the large folio feature) in the page
fault context.  If I modified the `mkfs.erofs` tool, I could give
some similar numbers, but I don't want to waste time now due
to `open arbitrary kernel files in the page fault context`.

As I said, if PuzzleFS finally upstreams some work to open kernel
files in page fault context, I will definitely work out the same
feature for EROFS soon, but currently I don't do that just
because it's very controversial and no in-tree kernel filesystem
does that.

The PuzzleFS kernel filesystem driver is still in an early POC stage, so
there's still a lot more work to be done.

I suggest that you just ask the FS/MM folks about this ("open
kernel files when reading in the page fault") first.

If they say "no", I suggest you don't waste time on this anymore.

Thanks,
Gao Xiang