On 03/06/2022 16:58, Marek Szyprowski wrote:
Hi Matthew,
On 03.06.2022 17:29, Matthew Wilcox wrote:
On Fri, Jun 03, 2022 at 10:55:01PM +0800, Hsin-Yi Wang wrote:
On Fri, Jun 3, 2022 at 10:10 PM Marek Szyprowski
<m.szyprowski@xxxxxxxxxxx> wrote:
Hi Matthew,
On 03.06.2022 14:59, Matthew Wilcox wrote:
On Fri, Jun 03, 2022 at 02:54:21PM +0200, Marek Szyprowski wrote:
On 01.06.2022 12:39, Hsin-Yi Wang wrote:
Implement a readahead callback for squashfs. It will read the datablocks
which cover the pages in the readahead request. In a few cases it will
not mark the pages as uptodate, including:
- file end is 0.
- zero-filled blocks.
- the current batch of pages isn't in the same datablock, or there aren't
enough pages in the datablock.
- decompressor error.
Otherwise the pages will be marked as uptodate. The unhandled pages will be
updated by readpage later.
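(For orientation, a rough sketch of the decision flow described above; the
helper names are hypothetical and this is not the committed implementation.)

	static void squashfs_readahead(struct readahead_control *ractl)
	{
		struct inode *inode = ractl->mapping->host;
		struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
		loff_t file_end = i_size_read(inode) >> msblk->block_log;
		struct page **pages;	/* batch of pages from the request */
		unsigned int nr;

		if (file_end == 0)
			return;		/* case 1: leave every page !uptodate */

		while ((nr = grab_page_batch(ractl, &pages)) != 0) {	/* hypothetical */
			if (!batch_fills_one_datablock(pages, nr))	/* case 3 */
				continue;	/* left for readpage to handle */
			if (datablock_is_zero_filled(inode, pages[0]))	/* case 2 */
				continue;
			if (decompress_datablock_into(pages, nr) < 0)	/* case 4 */
				continue;
			mark_batch_uptodate(pages, nr);	/* only here do pages become uptodate */
		}
	}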
Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Signed-off-by: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx>
Reported-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Reported-by: Phillip Lougher <phillip@xxxxxxxxxxxxxxx>
Reported-by: Xiongwei Song <Xiongwei.Song@xxxxxxxxxxxxx>
---
This patch landed recently in linux-next as commit 95f7a26191de
("squashfs: implement readahead"). I've noticed that it causes serious
issues on my test systems (various ARM 32bit and 64bit based boards).
The easiest way to observe is udev timeout 'waiting for /dev to be fully
populated' and prolonged booting time. I'm using squashfs for deploying
kernel modules via initrd. Reverting aeefca9dfae7 & 95f7a26191de on
top of next-20220603 fixes the issue.
How large are these files? Just a few kilobytes?
Yes, they are small; most of them are smaller than 16KB, some are about
128KB, and a few are about 256KB. I've sent a detailed list in a private mail.
Hi Marek,
Are there any obvious squashfs errors in dmesg? Did you enable
CONFIG_SQUASHFS_FILE_DIRECT or CONFIG_SQUASHFS_FILE_CACHE?
I don't think it's an error problem. I think it's a short file problem.
As I understand the current code (and apologies for not keeping up
to date with how the patch is progressing), if the file is less than
msblk->block_size bytes, we'll leave all the pages as !uptodate, leaving
them to be brought uptodate by squashfs_read_folio(). So Marek is hitting
the worst-case scenario where we re-read the entire block for each page
in it. I think we have to handle this tail case in ->readahead().
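Concretely, the idea would be something along these lines (a sketch only;
squashfs_readahead_short_file() is a made-up name, not an existing helper):

	static void squashfs_readahead(struct readahead_control *ractl)
	{
		struct inode *inode = ractl->mapping->host;
		struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;

		if (i_size_read(inode) < msblk->block_size) {
			/*
			 * Short file / tail: decompress the block once and copy
			 * it into every page of the request here, instead of
			 * leaving the pages !uptodate and letting
			 * squashfs_read_folio() re-read the whole block per page.
			 */
			squashfs_readahead_short_file(ractl);	/* hypothetical */
			return;
		}
		/* ... full-datablock readahead path ... */
	}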
I'm not sure if this is related to reading small files. There are
only 50 modules being loaded from the squashfs volume. I did a quick test of
reading the files.
Simple file read with this patch:
root@target:~# time find /initrd/ -type f | while read f; do cat $f
>/dev/null; done
real 0m5.865s
user 0m2.362s
sys 0m3.844s
Without:
root@target:~# time find /initrd/ -type f | while read f; do cat $f
>/dev/null; done
real 0m6.619s
user 0m2.112s
sys 0m4.827s
It has been a four-day holiday in the UK (the Queen's Platinum Jubilee),
hence the delay in responding.
The above read use-case is sequential (only one thread/process),
whereas the use-case where the slow-down is observed may be
parallel (multiple threads/processes entering Squashfs).
In the sequential use-case above, if the small files are held in
fragments, caching behaviour will ameliorate the case where the
same block is repeatedly re-read for each page in it. Each time
Squashfs is re-entered to handle a single page, the decompressed
block will be found in the fragment cache, eliminating a block
decompression per page.
In a parallel use-case the decompressed fragment block may be
evicted from the cache (by other reading processes), forcing the
block to be repeatedly decompressed.
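(For reference, a simplified sketch of the per-page fragment path involved;
it follows the shape of the code in fs/squashfs but is not the literal
implementation.)

	static int read_one_page_from_fragment(struct inode *inode,
					       struct page *page, int bytes)
	{
		/*
		 * Look the decompressed fragment block up in the shared
		 * fragment cache; a miss decompresses the whole block again.
		 */
		struct squashfs_cache_entry *buffer =
			squashfs_get_fragment(inode->i_sb,
					      squashfs_i(inode)->fragment_block,
					      squashfs_i(inode)->fragment_size);
		int res = buffer->error;

		if (!res)
			/* copy just this page's bytes out of the cached block */
			squashfs_copy_cache(page, buffer, bytes,
					    squashfs_i(inode)->fragment_offset);

		/*
		 * Drop the reference.  With parallel readers the entry can be
		 * evicted before the next page of the same file arrives,
		 * forcing another full decompression; read sequentially it is
		 * usually still cached.
		 */
		squashfs_cache_put(buffer);
		return res;
	}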
Hence the slow-down will be much more noticeable with a
parallel use-case than with a sequential one. That may also
be why this slipped through testing, if the test cases
are purely sequential in nature.
So Matthew's previous comment is still the most likely
explanation for the slow-down.
Phillip
Best regards