Another report of the issue (different call-flow, but the same error at
"shmem_read_mapping_page_gfp") at:
https://lore.kernel.org/lkml/6bb8c25c-cdcf-8bca-3db2-9871a90d518f@xxxxxxxxx/T/#m52d98b6bdb05764524a118b15cec048b34e5ca76

with a tentative approval for the patch:
https://lore.kernel.org/lkml/6bb8c25c-cdcf-8bca-3db2-9871a90d518f@xxxxxxxxx/T/#m24c2888a879d428cde5b34c43838301de544eb7e

Thanks and Regards,
Ajay

On Thu, Nov 11, 2021 at 2:16 PM Ajay Garg <ajaygargnsit@xxxxxxxxx> wrote:
>
> commit b9d02f1bdd98
> ("mm: shmem: don't truncate page if memory failure happens")
>
> introduced a PageHWPoison(page) call in "shmem_read_mapping_page_gfp"
> in shmem.c.
>
> Now, if "shmem_getpage_gfp" returns an error, page is set to an ERR-page.
> Thereafter, calling PageHWPoison() on this ERR-page dereferences the
> error pointer and makes the kernel oops (seen here on a KASAN-enabled
> build):
>
> #############################
> BUG: unable to handle page fault for address: fffffffffffffff4
> PF: supervisor read access in kernel mode
> PF: error_code(0x0000) - not-present page
> PGD 18e019067 P4D 18e019067 PUD 18e01b067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
> CPU: 0 PID: 4836 Comm: MATLAB Not tainted 5.15.0+ #18
> Hardware name: Dell Inc. Latitude E6320/0GJF11, BIOS A19 11/14/2013
> RIP: 0010:shmem_read_mapping_page_gfp+0xd3/0x140
> Code: 4c 89 ff e8 6f eb ff ff 5a 59 85 c0 74 64 48 63 d8 48 89 5d 98 be 08 00 00 00 48 89 df e8 e5 67 0c 00 48 89 df e8 6d 5c 0c 00 <48> 8b 13 48 c7 c0 fb ff ff ff f7 c2 00 00 80 00 74 30 48 ba 00 00
> RSP: 0018:ffff88806b33f998 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: fffffffffffffff4 RCX: ffffffffb7a37ba3
> RDX: 0000000000000003 RSI: dffffc0000000000 RDI: fffffffffffffff4
> RBP: ffff88806b33fa20 R08: 1ffffffffffffffe R09: fffffffffffffffb
> R10: fffffbffffffffff R11: 0000000000000001 R12: 1ffff1100d667f33
> R13: 00000000001120d2 R14: 00000000000005db R15: ffff88814e64e2d8
> FS: 00007f379a384640(0000) GS:ffff888161a00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: fffffffffffffff4 CR3: 00000000269dc004 CR4: 00000000000606f0
> Call Trace:
>  <TASK>
>  ? shmem_fault+0x480/0x480
>  ? __cond_resched+0x1c/0x30
>  ? __kasan_check_read+0x11/0x20
>  shmem_get_pages+0x3a4/0xa70 [i915]
>  ? shmem_writeback+0x3b0/0x3b0 [i915]
>  ? i915_gem_object_wait_reservation+0x330/0x560 [i915]
>  ...
>  ...
> ################################
>
> So, we proceed with the PageHWPoison() call only if the page is not an
> ERR-page.
>
>
> P.S. : Alternative (optimised) solution :
> ===========================================
>
> We could save some CPU cycles, if we directly replace
>
>         if (error)
>                 page = ERR_PTR(error);
>         else
>                 unlock_page(page);
>
> with
>
>         if (error)
>                 return ERR_PTR(error);
>
>
> Fixes: b9d02f1bdd98 ("mm: shmem: don't truncate page if memory failure happens")
> Signed-off-by: Ajay Garg <ajaygargnsit@xxxxxxxxx>
> ---
>  mm/shmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 23c91a8beb78..427863cbf0dc 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -4222,7 +4222,7 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
>          else
>                  unlock_page(page);
>
> -        if (PageHWPoison(page))
> +        if (!IS_ERR(page) && PageHWPoison(page))
>                  page = ERR_PTR(-EIO);
>
>          return page;
> --
> 2.30.2
>
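
For readers less familiar with the error-pointer convention, here is a minimal userspace sketch of why the unguarded check faults and how the two variants discussed in the quoted patch avoid it. The faulting address in the oops, fffffffffffffff4, is the 64-bit two's-complement encoding of -12, which is consistent with ERR_PTR(-ENOMEM) on Linux, although the quoted report does not state which errno was returned. Everything below is an assumption made for illustration: the ERR_PTR/PTR_ERR/IS_ERR helpers are stand-ins modelled on the kernel's <linux/err.h>, and "struct fake_page", "FAKE_PG_HWPOISON", "fake_page_hwpoison", "read_page_checked" and "read_page_early_return" are hypothetical names, not kernel code. Page locking is left out entirely.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins modelled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Encode a negative errno in the pointer value itself. */
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical stand-in for struct page and its HWPoison flag. */
struct fake_page {
	unsigned long flags;
};
#define FAKE_PG_HWPOISON (1UL << 0)

/* Reads page->flags, so it must never be handed an error pointer. */
static bool fake_page_hwpoison(const struct fake_page *page)
{
	return (page->flags & FAKE_PG_HWPOISON) != 0;
}

/*
 * Variant 1: mirrors the posted one-line fix. lookup_error plays the role
 * of the value returned by the page lookup (0 or a negative errno).
 */
static struct fake_page *read_page_checked(struct fake_page *page,
					   int lookup_error)
{
	if (lookup_error)
		page = ERR_PTR(lookup_error);

	/* Only dereference the page if it is not an encoded error. */
	if (!IS_ERR(page) && fake_page_hwpoison(page))
		page = ERR_PTR(-EIO);

	return page;
}

/* Variant 2: mirrors the P.S. idea of returning early on error. */
static struct fake_page *read_page_early_return(struct fake_page *page,
						int lookup_error)
{
	if (lookup_error)
		return ERR_PTR(lookup_error);

	if (fake_page_hwpoison(page))
		page = ERR_PTR(-EIO);

	return page;
}
```

Both variants return the same results in this sketch; the early return simply skips the IS_ERR() test on the success path. Calling either with a non-zero lookup_error returns the encoded error without touching page->flags, which is the dereference that the oops above shows faulting.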