On Sat, Feb 8, 2025 at 7:46 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > On Sat, Feb 8, 2025 at 2:11 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > > > On Fri, Feb 07, 2025 at 04:22:56PM -0800, Joanne Koong wrote: > > > > Thanks, Josef. I guess we can at least try to confirm we're on the right track. > > > > Can anyone affected see if this (only compile tested) patch fixes the issue? > > > > Created on top of 6.13.1. > > > > > > This fixes the crash for me on 6.14.0-rc1. I ran the repro using > > > Mantas's instructions for Obfuscate. I was able to trigger the crash > > > on a clean build and then with this patch, I'm not seeing the crash > > > anymore. > > > > Since this patch fixes the bug, we're looking for one call to folio_put() > > too many. Is it possibly in fuse_try_move_page()? In particular, this > > one: > > > > /* Drop ref for ap->pages[] array */ > > folio_put(oldfolio); > > > > I don't know fuse very well. Maybe this isn't it. > > Yeah, this looks it to me. We don't grab a folio reference for the > ap->pages[] array for readahead and it tracks with Mantas's > fuse_dev_splice_write() dmesg. this patch fixed the crash for me when > I tested it yesterday: > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 7d92a5479998..172cab8e2caf 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -955,8 +955,10 @@ static void fuse_readpages_end(struct fuse_mount > *fm, struct fuse_args *args, > fuse_invalidate_atime(inode); > } > > - for (i = 0; i < ap->num_folios; i++) > + for (i = 0; i < ap->num_folios; i++) { > folio_end_read(ap->folios[i], !err); > + folio_put(ap->folios[i]); > + } > if (ia->ff) > fuse_file_put(ia->ff, false); > > @@ -1049,6 +1051,7 @@ static void fuse_readahead(struct readahead_control *rac) > > while (ap->num_folios < cur_pages) { > folio = readahead_folio(rac); > + folio_get(folio); > ap->folios[ap->num_folios] = folio; > ap->descs[ap->num_folios].length = folio_size(folio); > ap->num_folios++; > > > I reran it just now with a printk by that ref drop in > fuse_try_move_page() and I'm indeed seeing that path get hit. > > Not sure why fstests didn't pick this up though since splice is > enabled by default in passthrough_hp, i'll look into this next week. This wasn't hit in fstests because passthrough_hp doesn't set SPLICE_F_MOVE. After adding that, I was able to trigger this crash by running generic/075. I'll send out a libfuse pr for this