Re: (resend)WARNING: trying to isolate tail page in isolate_lru_page


 



On Thu, Aug 25, 2022 at 11:23 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 25, 2022 at 10:50:19AM -0600, Yu Zhao wrote:
> > On Thu, Aug 25, 2022 at 8:40 AM 韩天硕 <hantianshuo@xxxxxxxxx> wrote:
> > >
> > > Hello:
> > >
> > >     My syzkaller instance reported the following issue on:
> > >
> > >
> > > HEAD commit: 072e51356cd5a4a1c12c1020bc054c99b98333df Merge tag 'nfs-for-5.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
> > >
> > > git tree: upstream
> > >
> > > kernel config: defconfig
> > >
> > > compiler: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
> > >
> > >
> > > ------------[ cut here ]------------
> > > trying to isolate tail page
> > > WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140
> > > Modules linked in:
> > > CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > RIP: 0010:isolate_lru_page+0x130/0x140
> > > Code: c3 89 c6 e8 22 4f f2 ff 85 db 75 0d e8 a9 4d f2 ff 44 89 e0 5b 5d 41 5c c3 e8 9c 4d f2 ff 48 c7 c7 a0 be 6a 93 e8 a9 f5 69 01 <0f> 0b eb de 66 66 2e 0f 1f 84 00 00 00 00 00 90 41 54 55 48 89 fd
> > > loop3: detected capacity change from 0 to 16383
> > > RSP: 0018:ffff88800844f8b8 EFLAGS: 00010282
> > > RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> > > RDX: ffffc90000509000 RSI: ffff8880037997c0 RDI: ffffed1001089f09
> > > RBP: ffffea000010b040 R08: ffffffff8117b3f8 R09: 0000000000000000
> > > R10: 0000000000000005 R11: ffffed100d2c4ead R12: 00000000fffffff0
> > > R13: ffff88800185aff0 R14: ffffea000010b048 R15: 0000000021000000
> > > FS:  00007f8acbd46700(0000) GS:ffff888069600000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000001b2c821000 CR3: 0000000005028005 CR4: 0000000000770ef0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > nfs4: Unknown parameter 'vfat'
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > PKRU: 55555554
> > > Call Trace:
> > >  <TASK>
> > >  madvise_cold_or_pageout_pte_range+0x43b/0x8f0
> > >  __walk_page_range+0xa48/0x1310
> > >  walk_page_range+0x14b/0x280
> > >  madvise_pageout+0x184/0x260
> > >  madvise_vma_behavior+0x843/0x13f0
> > >  do_madvise+0x310/0x5b0
> > >  __x64_sys_madvise+0x5f/0x70
> > >  do_syscall_64+0x38/0x90
> > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > RIP: 0033:0x7f8acc5d38bd
> > > Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> > > RSP: 002b:00007f8acbd45bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
> > > RAX: ffffffffffffffda RBX: 00007f8acc6f2f60 RCX: 00007f8acc5d38bd
> > > RDX: 0000000000000015 RSI: 0000000000004000 RDI: 0000000020ffc000
> > > RBP: 00007f8acc6400a9 R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > > R13: 00007ffec656fb0f R14: 00007ffec656fcb0 R15: 00007f8acbd45d80
> > >  </TASK>
> >
> > The above is from 5.18. Another report from 5.10:
> > https://lore.kernel.org/r/d927a335-a70b-48d3-9645-1d33cc88bd9c@xxxxxxxxxx/
> >
> > We also hit it on 5.4, 5.10 and 5.15:
> >   trying to isolate tail page
> >   WARNING: CPU: 1 PID: 4608 at mm/vmscan.c:2096 isolate_lru_page+0xb4/0x527 mm/vmscan.c:2096
> >   Modules linked in:
>
> Looks like my analysis from yesterday was dropped:
>
> : This all seems quite plausible.  The reproducer seems to (correct me
> : if I'm wrong) create an AF_PACKET socket and mmap it.  af_packet.c
> : seems to create compound pages and mmap them.  This isn't folio-related
> : at all; I just moved the code that warns about it from mm/vmscan.c to
> : folio-compat.c.
> :
> : Looks like a long-standing bug in MADV_PAGEOUT to me.

Such a page should never be on the LRU, right? Could we test the LRU
flag before calling isolate_lru_page() in this case? I know
isolate_lru_page() does that check itself, but the tail page warning is
raised before the check.

Could the tail page warning be moved under the LRU flag test? That seems
possible, but it would need extra handling (re-setting the LRU flag),
which seems like overkill.
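
To make the first suggestion concrete, here is a rough user-space model
of the ordering, not kernel code and not a patch. struct model_page,
model_isolate_lru_page(), model_pageout_one() and tail_warnings are all
hypothetical names invented for this sketch. The model also assumes the
isolate helper returns -EBUSY right after warning on a tail page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the two bits we care about. */
struct model_page {
	bool lru;	/* models the LRU flag */
	bool tail;	/* models "this is a tail page of a compound page" */
};

/* Counts how often the modelled "trying to isolate tail page" warning fires. */
static int tail_warnings;

/*
 * Models the ordering described above: the tail page warning comes
 * first, the LRU flag test (and clear) only afterwards.
 * Assumption: a tail page is rejected with -EBUSY after the warning.
 */
static int model_isolate_lru_page(struct model_page *page)
{
	assert(page != NULL);

	if (page->tail) {
		tail_warnings++;
		return -EBUSY;
	}
	if (!page->lru)
		return -EBUSY;

	page->lru = false;	/* page is now isolated from the LRU */
	return 0;
}

/*
 * Caller-side guard: test the LRU flag before calling the isolate
 * helper, so a tail page that is not on the LRU never reaches the
 * warning.
 */
static int model_pageout_one(struct model_page *page)
{
	if (!page->lru)
		return -EBUSY;

	return model_isolate_lru_page(page);
}
```

In this model a tail page with the LRU flag clear trips the warning when
model_isolate_lru_page() is called directly, but is skipped quietly when
going through model_pageout_one(). A real change would have to do the
equivalent test in the madvise page walk, which this sketch does not
attempt to show.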





