Re: squashfs performance regression and readahead

On Sun, May 15, 2022 at 8:55 AM Phillip Lougher <phillip@xxxxxxxxxxxxxxx> wrote:
>
> On 13/05/2022 07:35, Hsin-Yi Wang wrote:
> > On Fri, May 13, 2022 at 1:33 PM Phillip Lougher <phillip@xxxxxxxxxxxxxxx> wrote:
> >>
> >> My understanding is that this call will fully populate the
> >> pages array with page references without any holes.  That
> >> is none of the pages array entries will be NULL, meaning
> >> there isn't a page for that entry.  In other words, if the
> >> pages array has 32 pages, each of the 32 entries will
> >> reference a page.
> >>
> > I noticed that if nr_pages < max_pages, calling read_blocklist() will
> > have SQUASHFS errors,
> >
> > SQUASHFS error: Failed to read block 0x125ef7d: -5
> > SQUASHFS error: zlib decompression failed, data probably corrupt
> >
> > so I did a check if nr_pages < max_pages before squashfs_read_data(),
> > just skip the remaining pages and let them be handled by readpage.
> >
>
> Yes that avoids passing the decompressor code a too small page range.
> As such extending the decompressor code isn't necessary.
>
> Testing your patch I discovered a number of cases where
> the decompressor still failed as above.
>
> This I traced to "sparse blocks", these are zero filled blocks, and
> are indicated/stored as a block length of 0 (bsize == 0).  Skipping
> this sparse block and letting it be handled by readpage fixes this
> issue.
>
Ack. Thanks for testing this.

> I also noticed a potential performance improvement.  You check for
> "(pages[nr_pages - 1]->index >> shift) == index" after calling
> squashfs_read_data.  But this information is known before
> calling squashfs_read_data and moving the check to before
> squashfs_read_data saves the cost of doing a redundant block
> decompression.
>
After applying this, the performance becomes:
2.73s
2.76s
2.73s

Original:
2.76s
2.79s
2.77s

(The pack file is different from my previous testing in this email thread.)
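
To put the two sets of timings side by side, here is a small userspace sketch that averages each set and reports the relative change. The helper names (mean, percent_change) and the array names are made up for illustration; only the six timings come from the runs above.

```c
#include <stddef.h>

/* Timings in seconds, copied from the runs listed above. */
static const double patched_s[]  = { 2.73, 2.76, 2.73 };
static const double original_s[] = { 2.76, 2.79, 2.77 };

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean(const double *samples, size_t n)
{
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* Relative change of "after" versus "before" in percent (negative = faster). */
static double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With these numbers the means are 2.74s (patched) and about 2.773s (original), a change of roughly -1.2%. With only three runs per configuration this is indicative at best.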

> Finally I noticed that if nr_pages grows after the __readahead_batch
> call, then the pages array and the page actor will be too small, and
> it will cause the decompressor to fail.  Changing the allocation to
> max_pages fixes this.
>
Ack.
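
Putting the checks discussed in this thread together, the decision whether a batch goes to the decompressor or is left for readpage can be modelled in userspace as below. This is only a sketch, not kernel code: the helper name batch_is_decompressible and its parameters are hypothetical, and page indexes are passed as plain integers instead of struct page pointers.

```c
#include <stdbool.h>

/*
 * Userspace model of the checks done before squashfs_read_data().
 * Hypothetical helper for illustration, not part of the kernel.
 *
 * first_index, last_index: page-cache indexes of the first and last
 *                          page in the batch
 * nr_pages:  number of pages returned for this batch
 * max_pages: pages per datablock, i.e. 1 << shift
 * shift:     block_log - PAGE_SHIFT
 * bsize:     block length from read_blocklist(); 0 means a sparse block
 */
static bool batch_is_decompressible(unsigned long first_index,
				    unsigned long last_index,
				    unsigned int nr_pages,
				    unsigned int max_pages,
				    unsigned int shift,
				    int bsize)
{
	/* A short batch would hand the decompressor too small a page range. */
	if (nr_pages < max_pages)
		return false;

	/* All pages must belong to the same datablock. */
	if ((first_index >> shift) != (last_index >> shift))
		return false;

	/* Sparse (zero-filled) blocks are stored with a length of 0. */
	if (bsize == 0)
		return false;

	return true;
}
```

For example, with shift = 5 and max_pages = 32, a full batch covering page indexes 32..63 with a non-zero bsize passes, while a batch covering 16..47 straddles two datablocks and is skipped.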

I've attached an updated patch that includes these fixes and the previous ones.
> I have rolled these fixes into the patch below (also attached in
> case it gets garbled).
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index 7cd57e0d88de..14485a7af5cf 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -518,13 +518,11 @@ static void squashfs_readahead(struct readahead_control *ractl)
>             file_end == 0)
>                 return;
>
> -       nr_pages = min(readahead_count(ractl), max_pages);
> -
> -       pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL);
> +       pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
>         if (!pages)
>                 return;
>
> -       actor = squashfs_page_actor_init_special(pages, nr_pages, 0);
> +       actor = squashfs_page_actor_init_special(pages, max_pages, 0);
>         if (!actor)
>                 goto out;
>
> @@ -538,11 +536,18 @@ static void squashfs_readahead(struct readahead_control *ractl)
>                         goto skip_pages;
>
>                 index = pages[0]->index >> shift;
> +
> +               if ((pages[nr_pages - 1]->index >> shift) != index)
> +                       goto skip_pages;
> +
>                 bsize = read_blocklist(inode, index, &block);
> +               if (bsize == 0)
> +                       goto skip_pages;
> +
>                 res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
>                                          actor);
>
> -               if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index)
> +               if (res >= 0)
>                         for (i = 0; i < nr_pages; i++)
>                                 SetPageUptodate(pages[i]);
>
> --
> 2.34.1
>
>
>
> Phillip
>
>
> >> This is important for the decompression code, because it
> >> expects each pages array entry to reference a page, which
> >> can be kmapped to an address.  If an entry in the pages
> >> array is NULL, this will break.
> >>
> >> If the pages array can have holes (NULL pointers), I have
> >> written an update patch which allows the decompression code
> >> to handle these NULL pointers.
> >>
> >> If the pages array can have NULL pointers, I can send you
> >> the patch which will deal with this.
> >
> > Sure, if there are better ways to deal with this.
> >
> > Thanks.
> >
> >>
> >> Thanks
> >>
> >> Phillip
> >>
> >>
> >>
> >>>
> >>>>>
> >>>>> It's also noticed that when the crash happened, nr_pages obtained by
> >>>>> readahead_count() is 512.
> >>>>> nr_pages = readahead_count(ractl); // this line
> >>>>>
> >>>>> 2) Normal cases that won't crash:
> >>>>> [   22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
> >>>>> [   22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
> >>>>> [   22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
> >>>>> [   22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
> >>>>> [   22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
> >>>>> [   22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
> >>>>> [   22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
> >>>>> [   22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
> >>>>> [   22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
> >>>>>
> >>>>> nr_pages are observed to be 32, 64, 256... These won't cause a crash.
> >>>>> Other values (max_pages, bsize, block...) look normal.
> >>>>>
> >>>>> I'm not sure why the crash happened, but I tried to modify the mask
> >>>>> for a bit. After modifying the mask value to below, the crash is gone
> >>>>> (nr_pages are <=256).
> >>>>> Based on my testing on a 300K pack file, there's no performance change.
> >>>>>
> >>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> >>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644
> >>>>> --- a/fs/squashfs/file.c
> >>>>> +++ b/fs/squashfs/file.c
> >>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct readahead_control *ractl)
> >>>>>     {
> >>>>>            struct inode *inode = ractl->mapping->host;
> >>>>>            struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> >>>>> -       size_t mask = (1UL << msblk->block_log) - 1;
> >>>>>            size_t shift = msblk->block_log - PAGE_SHIFT;
> >>>>> +       size_t mask = (1UL << shift) - 1;
> >>>>>
> >>>>>
> >>>>> Any pointers are appreciated. Thanks!
> >>>>
> >>
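
For reference, the two mask forms in the quoted diff work in different units: one covers byte offsets within a datablock, the other covers page indexes within a datablock. The userspace sketch below illustrates the arithmetic only. EXAMPLE_PAGE_SHIFT (4 KiB pages) and all helper names are assumptions made for the example, not kernel definitions.

```c
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Pages per datablock: 1 << (block_log - PAGE_SHIFT). */
static unsigned int pages_per_block(unsigned int block_log)
{
	return 1U << (block_log - EXAMPLE_PAGE_SHIFT);
}

/* Mask over byte offsets within a datablock: (1UL << block_log) - 1. */
static uint64_t byte_mask(unsigned int block_log)
{
	return ((uint64_t)1 << block_log) - 1;
}

/* Mask over page indexes within a datablock: (1UL << shift) - 1. */
static uint64_t page_mask(unsigned int block_log)
{
	return ((uint64_t)1 << (block_log - EXAMPLE_PAGE_SHIFT)) - 1;
}

/* Round a byte position down to the start of its datablock. */
static uint64_t block_start(uint64_t pos, unsigned int block_log)
{
	return pos & ~byte_mask(block_log);
}
```

With block_log = 17 (128 KiB blocks) this gives 32 pages per block, a byte mask of 0x1ffff and a page mask of 0x1f. readahead_pos() returns a byte offset, so the byte form is the one that rounds it down to a block boundary, and that is the form used in the patch that follows.
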
From b24e7e6068f3e56a66b914798bbc4dd84a84b1ca Mon Sep 17 00:00:00 2001
From: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx>
Date: Sun, 10 Oct 2021 21:22:25 +0800
Subject: [PATCH] squashfs: implement readahead

Implement readahead callback for squashfs. It will read datablocks
which cover pages in readahead request. For a few cases it will
not mark page as uptodate, including:
- file end is 0.
- zero filled blocks.
- current batch of pages isn't in the same datablock or not enough in a
  datablock.
Otherwise pages will be marked as uptodate. The unhandled pages will be
updated by readpage later.

Signed-off-by: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx>
---
 fs/squashfs/file.c | 79 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index 89d492916dea..48b134a315b7 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -39,6 +39,7 @@
 #include "squashfs_fs_sb.h"
 #include "squashfs_fs_i.h"
 #include "squashfs.h"
+#include "page_actor.h"
 
 /*
  * Locate cache slot in range [offset, index] for specified inode.  If
@@ -494,7 +495,83 @@ static int squashfs_readpage(struct file *file, struct page *page)
 	return 0;
 }
 
+static void squashfs_readahead(struct readahead_control *ractl)
+{
+	struct inode *inode = ractl->mapping->host;
+	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
+	size_t mask = (1UL << msblk->block_log) - 1;
+	size_t shift = msblk->block_log - PAGE_SHIFT;
+	loff_t req_end = readahead_pos(ractl) + readahead_length(ractl);
+	loff_t start = readahead_pos(ractl) &~ mask;
+	size_t len = readahead_length(ractl) + readahead_pos(ractl) - start;
+	struct squashfs_page_actor *actor;
+	unsigned int nr_pages = 0;
+	struct page **pages;
+	u64 block = 0;
+	int bsize, res, i, index;
+	int file_end = i_size_read(inode) >> msblk->block_log;
+	unsigned int max_pages = 1UL << shift;
+
+	readahead_expand(ractl, start, (len | mask) + 1);
+
+	if (readahead_pos(ractl) + readahead_length(ractl) < req_end ||
+	    file_end == 0)
+		return;
+
+	pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
+	if (!pages)
+		return;
+
+	actor = squashfs_page_actor_init_special(pages, max_pages, 0);
+	if (!actor)
+		goto out;
+
+	for (;;) {
+		nr_pages = __readahead_batch(ractl, pages, max_pages);
+		if (!nr_pages)
+			break;
+
+		if (readahead_pos(ractl) >= i_size_read(inode) ||
+		    nr_pages < max_pages)
+			goto skip_pages;
+
+		index = pages[0]->index >> shift;
+		if ((pages[nr_pages - 1]->index >> shift) != index)
+			goto skip_pages;
+
+		bsize = read_blocklist(inode, index, &block);
+		if (bsize == 0)
+			goto skip_pages;
+
+		res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
+					 actor);
+
+		if (res >= 0)
+			for (i = 0; i < nr_pages; i++)
+				SetPageUptodate(pages[i]);
+
+		for (i = 0; i < nr_pages; i++) {
+			unlock_page(pages[i]);
+			put_page(pages[i]);
+		}
+	}
+
+	kfree(actor);
+	kfree(pages);
+	return;
+
+skip_pages:
+	for (i = 0; i < nr_pages; i++) {
+		unlock_page(pages[i]);
+		put_page(pages[i]);
+	}
+
+	kfree(actor);
+out:
+	kfree(pages);
+}
 
 const struct address_space_operations squashfs_aops = {
-	.readpage = squashfs_readpage
+	.readpage = squashfs_readpage,
+	.readahead = squashfs_readahead
 };
-- 
2.31.0

