Re: [REGRESSION][BISECTED] Crash with Bad page state for FUSE/Flatpak related applications since v6.13

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/8/25 16:46, Joanne Koong wrote:
> On Sat, Feb 8, 2025 at 2:11 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>>
>> On Fri, Feb 07, 2025 at 04:22:56PM -0800, Joanne Koong wrote:
>> > > Thanks, Josef. I guess we can at least try to confirm we're on the right track.
>> > > Can anyone affected see if this (only compile tested) patch fixes the issue?
>> > > Created on top of 6.13.1.
>> >
>> > This fixes the crash for me on 6.14.0-rc1. I ran the repro using
>> > Mantas's instructions for Obfuscate. I was able to trigger the crash
>> > on a clean build and then with this patch, I'm not seeing the crash
>> > anymore.
>>
>> Since this patch fixes the bug, we're looking for one call to folio_put()
>> too many.  Is it possibly in fuse_try_move_page()?  In particular, this
>> one:
>>
>>         /* Drop ref for ap->pages[] array */
>>         folio_put(oldfolio);
>>
>> I don't know fuse very well.  Maybe this isn't it.
> 
> Yeah, this looks it to me. We don't grab a folio reference for the
> ap->pages[] array for readahead and it tracks with Mantas's
> fuse_dev_splice_write() dmesg. this patch fixed the crash for me when
> I tested it yesterday:
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 7d92a5479998..172cab8e2caf 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -955,8 +955,10 @@ static void fuse_readpages_end(struct fuse_mount
> *fm, struct fuse_args *args,
>                 fuse_invalidate_atime(inode);
>         }
> 
> -       for (i = 0; i < ap->num_folios; i++)
> +       for (i = 0; i < ap->num_folios; i++) {
>                 folio_end_read(ap->folios[i], !err);
> +               folio_put(ap->folios[i]);
> +       }
>         if (ia->ff)
>                 fuse_file_put(ia->ff, false);
> 
> @@ -1049,6 +1051,7 @@ static void fuse_readahead(struct readahead_control *rac)
> 
>                 while (ap->num_folios < cur_pages) {
>                         folio = readahead_folio(rac);
> +                       folio_get(folio);

This is almost the same as my patch, but balances the folio_put() in
readahead_folio() with another folio_get(), while mine uses
__readahead_folio() that does not do folio_put() in the first place.

But I think neither patch proves the extraneous folio_put() comes from
fuse_try_move_page().

>                         ap->folios[ap->num_folios] = folio;
>                         ap->descs[ap->num_folios].length = folio_size(folio);
>                         ap->num_folios++;
> 
> 
> I reran it just now with a printk by that ref drop in
> fuse_try_move_page() and I'm indeed seeing that path get hit.

It might get hit, but is it hit in the readahead paths? One way to test
would be to instead of yours above or mine change, to stop doing the
foio_put() in fuse_try_move_page(). But maybe it's called also from other
contexts that do expect it, and will leak memory otherwise.

> Not sure why fstests didn't pick this up though since splice is
> enabled by default in passthrough_hp, i'll look into this next week.
> 





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux