Hi Jason, Could you add "Tested-by" tag from me? Thanks! BR. On 2023-04-01 at 14:29:33 +0800, Pengfei Xu wrote: > Hi Jason, > > On 2023-03-31 at 12:32:26 -0300, Jason Gunthorpe wrote: > > If batch->end is 0 then setting npfns[0] before computing the new value of > > pfns will fail to adjust the pfn and result in various page accounting > > corruptions. It should be ordered after. > > > > This seems to result in various kinds of page meta-data corruption related > > failures: > > > > WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740 > > Modules linked in: > > CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > > RIP: 0010:try_grab_folio+0x503/0x740 > > Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8 2f bf dd ff 48 8b 5d c8 31 ff 48 89 > > RSP: 0018:ffffc90000f37908 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26 > > RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002 > > RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008 > > R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540 > > R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540 > > FS: 00007fecbf659740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0 > > PKRU: 55555554 > > Call Trace: > > <TASK> > > internal_get_user_pages_fast+0xd32/0x2200 > > pin_user_pages_fast+0x65/0x90 > > pfn_reader_user_pin+0x376/0x390 > > pfn_reader_next+0x14a/0x7b0 > > pfn_reader_first+0x140/0x1b0 > > iopt_area_fill_domain+0x74/0x210 > > iopt_table_add_domain+0x30e/0x6e0 > > iommufd_device_selftest_attach+0x7f/0x140 > > iommufd_test+0x10ff/0x16f0 > > iommufd_fops_ioctl+0x206/0x330 > > __x64_sys_ioctl+0x10e/0x160 > > do_syscall_64+0x3b/0x90 > > entry_SYSCALL_64_after_hwframe+0x72/0xdc > > > > Cc: <stable@xxxxxxxxxxxxxxx> > > Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages") > > Reported-by: Pengfei Xu <pengfei.xu@xxxxxxxxx> > > Link: https://lore.kernel.org/r/ZBExkEW/On0ue68q@xxxxxxxxxxxxxxxx > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > --- > > drivers/iommu/iommufd/pages.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c > > index b11aace836542d..3c47846cc5efe8 100644 > > --- a/drivers/iommu/iommufd/pages.c > > +++ b/drivers/iommu/iommufd/pages.c > > @@ -294,9 +294,9 @@ static void batch_clear_carry(struct pfn_batch *batch, unsigned int keep_pfns) > > batch->npfns[batch->end - 1] < keep_pfns); > > > > batch->total_pfns = keep_pfns; > > - batch->npfns[0] = keep_pfns; > > batch->pfns[0] = batch->pfns[batch->end - 1] + > > (batch->npfns[batch->end - 1] - keep_pfns); > > + batch->npfns[0] = keep_pfns; > > batch->end = 0; > > } > > > > -- > I tested the reproduced code in the kernel with all 3 fixed patches. > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230313_234302_try_grab_folio/repro.c > This issue was gone and the issue was fixed. > > Thanks! > BR. > > > 2.40.0 > >