On Sun, May 15, 2022 at 8:55 AM Phillip Lougher <phillip@xxxxxxxxxxxxxxx> wrote: > > On 13/05/2022 07:35, Hsin-Yi Wang wrote: > > On Fri, May 13, 2022 at 1:33 PM Phillip Lougher <phillip@xxxxxxxxxxxxxxx> wrote: > >> > >> My understanding is that this call will fully populate the > >> pages array with page references without any holes. That > >> is none of the pages array entries will be NULL, meaning > >> there isn't a page for that entry. In other words, if the > >> pages array has 32 pages, each of the 32 entries will > >> reference a page. > >> > > I noticed that if nr_pages < max_pages, calling read_blocklist() will > > have SQUASHFS errors, > > > > SQUASHFS error: Failed to read block 0x125ef7d: -5 > > SQUASHFS error: zlib decompression failed, data probably corrupt > > > > so I did a check if nr_pages < max_pages before squashfs_read_data(), > > just skip the remaining pages and let them be handled by readpage. > > > > Yes that avoids passing the decompressor code a too small page range. > As such extending the decompressor code isn't necessary. > > Testing your patch I discovered a number of cases where > the decompressor still failed as above. > > This I traced to "sparse blocks", these are zero filled blocks, and > are indicated/stored as a block length of 0 (bsize == 0). Skipping > this sparse block and letting it be handled by readpage fixes this > issue. > Ack. Thanks for testing this. > I also noticed a potential performance improvement. You check for > "pages[nr_pages - 1]->index >> shift) == index" after calling > squashfs_read_data. But this information is known before > calling squashfs_read_data and moving the check to before > squashfs_read_data saves the cost of doing a redundant block > decompression. > After applying this, The performance becomes: 2.73s 2.76s 2.73s Original: 2.76s 2.79s 2.77s (The pack file is different from my previous testing in this email thread.) > Finally I noticed that if nr_pages grows after the __readahead_batch > call, then the pages array and the page actor will be too small, and > it will cause the decompressor to fail. Changing the allocation to > max_pages fixes this. > Ack. I've added the fixes patch and previous fixes. > I have rolled these fixes into the patch below (also attached in > case it gets garbled). > > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c > index 7cd57e0d88de..14485a7af5cf 100644 > --- a/fs/squashfs/file.c > +++ b/fs/squashfs/file.c > @@ -518,13 +518,11 @@ static void squashfs_readahead(struct > readahead_control *ractl) > file_end == 0) > return; > > - nr_pages = min(readahead_count(ractl), max_pages); > - > - pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL); > + pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL); > if (!pages) > return; > > - actor = squashfs_page_actor_init_special(pages, nr_pages, 0); > + actor = squashfs_page_actor_init_special(pages, max_pages, 0); > if (!actor) > goto out; > > @@ -538,11 +536,18 @@ static void squashfs_readahead(struct > readahead_control *ractl) > goto skip_pages; > > index = pages[0]->index >> shift; > + > + if ((pages[nr_pages - 1]->index >> shift) != index) > + goto skip_pages; > + > bsize = read_blocklist(inode, index, &block); > + if (bsize == 0) > + goto skip_pages; > + > res = squashfs_read_data(inode->i_sb, block, bsize, NULL, > actor); > > - if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index) > + if (res >= 0) > for (i = 0; i < nr_pages; i++) > SetPageUptodate(pages[i]); > > -- > 2.34.1 > > > > Phillip > > > >> This is important for the decompression code, because it > >> expects each pages array entry to reference a page, which > >> can be kmapped to an address. If an entry in the pages > >> array is NULL, this will break. > >> > >> If the pages array can have holes (NULL pointers), I have > >> written an update patch which allows the decompression code > >> to handle these NULL pointers. > >> > >> If the pages array can have NULL pointers, I can send you > >> the patch which will deal with this. > > > > Sure, if there are better ways to deal with this. > > > > Thanks. > > > >> > >> Thanks > >> > >> Phillip > >> > >> > >> > >>> > >>>>> > >>>>> It's also noticed that when the crash happened, nr_pages obtained by > >>>>> readahead_count() is 512. > >>>>> nr_pages = readahead_count(ractl); // this line > >>>>> > >>>>> 2) Normal cases that won't crash: > >>>>> [ 22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144 > >>>>> [ 22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144 > >>>>> [ 22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072 > >>>>> [ 22.666099] Block @ 0xb593881, compressed size 39742, src size 262144 > >>>>> [ 22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144 > >>>>> [ 22.695739] Block @ 0x13698673, compressed size 65907, src size 131072 > >>>>> [ 22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072 > >>>>> [ 22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072 > >>>>> [ 22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072 > >>>>> > >>>>> nr_pages are observed to be 32, 64, 256... These won't cause a crash. > >>>>> Other values (max_pages, bsize, block...) looks normal > >>>>> > >>>>> I'm not sure why the crash happened, but I tried to modify the mask > >>>>> for a bit. After modifying the mask value to below, the crash is gone > >>>>> (nr_pages are <=256). > >>>>> Based on my testing on a 300K pack file, there's no performance change. > >>>>> > >>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c > >>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644 > >>>>> --- a/fs/squashfs/file.c > >>>>> +++ b/fs/squashfs/file.c > >>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct > >>>>> readahead_control *ractl) > >>>>> { > >>>>> struct inode *inode = ractl->mapping->host; > >>>>> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info; > >>>>> - size_t mask = (1UL << msblk->block_log) - 1; > >>>>> size_t shift = msblk->block_log - PAGE_SHIFT; > >>>>> + size_t mask = (1UL << shift) - 1; > >>>>> > >>>>> > >>>>> Any pointers are appreciated. Thanks! > >>>> > >>
From b24e7e6068f3e56a66b914798bbc4dd84a84b1ca Mon Sep 17 00:00:00 2001 From: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx> Date: Sun, 10 Oct 2021 21:22:25 +0800 Subject: [PATCH] squashfs: implement readahead Implement readahead callback for squashfs. It will read datablocks which cover pages in readahead request. For a few cases it will not mark page as uptodate, including: - file end is 0. - zero filled blocks. - current batch of pages isn't in the same datablock or not enough in a datablock. Otherwise pages will be marked as uptodate. The unhandled pages will be updated by readpage later. Signed-off-by: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx> --- fs/squashfs/file.c | 79 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 78 insertions(+), 1 deletion(-) diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c index 89d492916dea..48b134a315b7 100644 --- a/fs/squashfs/file.c +++ b/fs/squashfs/file.c @@ -39,6 +39,7 @@ #include "squashfs_fs_sb.h" #include "squashfs_fs_i.h" #include "squashfs.h" +#include "page_actor.h" /* * Locate cache slot in range [offset, index] for specified inode. If @@ -494,7 +495,83 @@ static int squashfs_readpage(struct file *file, struct page *page) return 0; } +static void squashfs_readahead(struct readahead_control *ractl) +{ + struct inode *inode = ractl->mapping->host; + struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info; + size_t mask = (1UL << msblk->block_log) - 1; + size_t shift = msblk->block_log - PAGE_SHIFT; + loff_t req_end = readahead_pos(ractl) + readahead_length(ractl); + loff_t start = readahead_pos(ractl) &~ mask; + size_t len = readahead_length(ractl) + readahead_pos(ractl) - start; + struct squashfs_page_actor *actor; + unsigned int nr_pages = 0; + struct page **pages; + u64 block = 0; + int bsize, res, i, index; + int file_end = i_size_read(inode) >> msblk->block_log; + unsigned int max_pages = 1UL << shift; + + readahead_expand(ractl, start, (len | mask) + 1); + + if (readahead_pos(ractl) + readahead_length(ractl) < req_end || + file_end == 0) + return; + + pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL); + if (!pages) + return; + + actor = squashfs_page_actor_init_special(pages, max_pages, 0); + if (!actor) + goto out; + + for (;;) { + nr_pages = __readahead_batch(ractl, pages, max_pages); + if (!nr_pages) + break; + + if (readahead_pos(ractl) >= i_size_read(inode) || + nr_pages < max_pages) + goto skip_pages; + + index = pages[0]->index >> shift; + if ((pages[nr_pages - 1]->index >> shift) != index) + goto skip_pages; + + bsize = read_blocklist(inode, index, &block); + if (bsize == 0) + goto skip_pages; + + res = squashfs_read_data(inode->i_sb, block, bsize, NULL, + actor); + + if (res >= 0) + for (i = 0; i < nr_pages; i++) + SetPageUptodate(pages[i]); + + for (i = 0; i < nr_pages; i++) { + unlock_page(pages[i]); + put_page(pages[i]); + } + } + + kfree(actor); + kfree(pages); + return; + +skip_pages: + for (i = 0; i < nr_pages; i++) { + unlock_page(pages[i]); + put_page(pages[i]); + } + + kfree(actor); +out: + kfree(pages); +} const struct address_space_operations squashfs_aops = { - .readpage = squashfs_readpage + .readpage = squashfs_readpage, + .readahead = squashfs_readahead }; -- 2.31.0