Re: (resend)WARNING: trying to isolate tail page in isolate_lru_page

On Thu, Aug 25, 2022 at 11:40:11AM -0700, Yang Shi wrote:
> On Thu, Aug 25, 2022 at 11:23 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Aug 25, 2022 at 10:50:19AM -0600, Yu Zhao wrote:
> > > On Thu, Aug 25, 2022 at 8:40 AM 韩天硕 <hantianshuo@xxxxxxxxx> wrote:
> > > >
> > > > Hello:
> > > >
> > > >     My Syzkaller instance reported the following issue on:
> > > >
> > > >
> > > > HEAD commit: 072e51356cd5a4a1c12c1020bc054c99b98333df Merge tag 'nfs-for-5.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
> > > >
> > > > git tree: upstream
> > > >
> > > > kernel config: defconfig
> > > >
> > > > compiler: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
> > > >
> > > >
> > > > ------------[ cut here ]------------
> > > > trying to isolate tail page
> > > > WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140
> > > > Modules linked in:
> > > > CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > RIP: 0010:isolate_lru_page+0x130/0x140
> > > > Code: c3 89 c6 e8 22 4f f2 ff 85 db 75 0d e8 a9 4d f2 ff 44 89 e0 5b 5d 41 5c c3 e8 9c 4d f2 ff 48 c7 c7 a0 be 6a 93 e8 a9 f5 69 01 <0f> 0b eb de 66 66 2e 0f 1f 84 00 00 00 00 00 90 41 54 55 48 89 fd
> > > > loop3: detected capacity change from 0 to 16383
> > > > RSP: 0018:ffff88800844f8b8 EFLAGS: 00010282
> > > > RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> > > > RDX: ffffc90000509000 RSI: ffff8880037997c0 RDI: ffffed1001089f09
> > > > RBP: ffffea000010b040 R08: ffffffff8117b3f8 R09: 0000000000000000
> > > > R10: 0000000000000005 R11: ffffed100d2c4ead R12: 00000000fffffff0
> > > > R13: ffff88800185aff0 R14: ffffea000010b048 R15: 0000000021000000
> > > > FS:  00007f8acbd46700(0000) GS:ffff888069600000(0000) knlGS:0000000000000000
> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 0000001b2c821000 CR3: 0000000005028005 CR4: 0000000000770ef0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > nfs4: Unknown parameter 'vfat'
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > PKRU: 55555554
> > > > Call Trace:
> > > >  <TASK>
> > > >  madvise_cold_or_pageout_pte_range+0x43b/0x8f0
> > > >  __walk_page_range+0xa48/0x1310
> > > >  walk_page_range+0x14b/0x280
> > > >  madvise_pageout+0x184/0x260
> > > >  madvise_vma_behavior+0x843/0x13f0
> > > >  do_madvise+0x310/0x5b0
> > > >  __x64_sys_madvise+0x5f/0x70
> > > >  do_syscall_64+0x38/0x90
> > > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > RIP: 0033:0x7f8acc5d38bd
> > > > Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> > > > RSP: 002b:00007f8acbd45bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
> > > > RAX: ffffffffffffffda RBX: 00007f8acc6f2f60 RCX: 00007f8acc5d38bd
> > > > RDX: 0000000000000015 RSI: 0000000000004000 RDI: 0000000020ffc000
> > > > RBP: 00007f8acc6400a9 R08: 0000000000000000 R09: 0000000000000000
> > > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > > > R13: 00007ffec656fb0f R14: 00007ffec656fcb0 R15: 00007f8acbd45d80
> > > >  </TASK>
> > >
> > > The above is from 5.18. Another report from 5.10:
> > > https://lore.kernel.org/r/d927a335-a70b-48d3-9645-1d33cc88bd9c@xxxxxxxxxx/
> > >
> > > We also hit it on 5.4, 5.10 and 5.15:
> > >   trying to isolate tail page
> > >   WARNING: CPU: 1 PID: 4608 at mm/vmscan.c:2096 isolate_lru_page+0xb4/0x527 mm/vmscan.c:2096
> > >   Modules linked in:
> >
> > Looks like my analysis from yesterday was dropped:
> >
> > : This all seems quite plausible.  The reproducer seems to (correct me
> > : if I'm wrong) create an AF_PACKET socket and mmap it.  af_packet.c
> > : seems to create compound pages and mmap them.  This isn't folio-related
> > : at all; I just moved the code that warns about it from mm/vmscan.c to
> > : folio-compat.c.
> > :
> > : Looks like a long-standing bug in MADV_PAGEOUT to me.
> 
> Such a page should never be on the LRU, right? Could we test the LRU flag
> before calling isolate_lru_page() in this case? I know isolate_lru_page()
> does the check, but the tail page warning is raised before the check.
> 
> Could the tail page warning be moved under the LRU flag test? It seems
> possible, but it would need extra handling (re-setting the LRU flag), which
> seems like overkill.

There are a number of ways of solving this.  I'm interested in seeing
which one Minchan thinks is best.
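
For illustration only, here is a userspace toy model of the ordering issue
discussed above: the tail page warning fires before the LRU flag is tested,
so a caller that tests the flag first never reaches the warning for a page
that is not on the LRU. This is a sketch, not kernel code and not a proposed
patch. Every name in it (toy_page, toy_isolate_lru_page, toy_pageout_one,
toy_warnings) is hypothetical, and the -EBUSY return on the warning path is
an assumption of the model.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a page; these fields are NOT the kernel's struct page. */
struct toy_page {
	bool on_lru;   /* models the LRU page flag */
	bool is_tail;  /* models "tail page of a compound page" */
};

/* Counts how many times the modelled warning fired. */
static int toy_warnings;

/*
 * Models the ordering described in the thread: the tail page warning is
 * raised first, and only afterwards is the LRU flag tested and cleared.
 */
static int toy_isolate_lru_page(struct toy_page *page)
{
	if (page->is_tail) {
		toy_warnings++;
		fprintf(stderr, "trying to isolate tail page\n");
		return -EBUSY; /* assumption: the warning path bails out */
	}
	if (!page->on_lru)
		return -EBUSY;
	page->on_lru = false; /* page is now isolated from the LRU */
	return 0;
}

/*
 * Models the caller-side option: test the LRU flag before calling the
 * isolate helper, so non-LRU pages are skipped without any warning.
 */
static int toy_pageout_one(struct toy_page *page)
{
	if (!page->on_lru)
		return -EBUSY;
	return toy_isolate_lru_page(page);
}
```

In this model, a tail page that is not on the LRU is skipped silently by
toy_pageout_one(), while a direct call to toy_isolate_lru_page() on the same
page triggers the warning. Whether a caller-side test like this is the right
fix is exactly the open question in this thread.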



