Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Sun, 10 Mar 2024 19:57:08 +0000

On Sun, Mar 10, 2024 at 04:31:25PM +0000, Ryan Roberts wrote:
> That's exactly how I discovered the original problem, and was hoping
> that with your fix, this would unblock me. Given I can only repro this
> when my changes are on top, I guess my code is most likely buggy,
> but perhaps you can take a quick look at the oops and tell me what
> you think?

Well, now my code isn't implicated, I have no interest in helping you.

Just kidding ;-)

> [   96.372503] BUG: Bad page state in process usemem  pfn:be502
> [   96.373336] page: refcount:0 mapcount:0 mapping:000000005abfa8d5 index:0x0 pfn:0xbe502
> [   96.374341] aops:0x0 ino:fffffc0001f940c8
> [   96.374893] flags: 0x7fff8000000000(node=0|zone=0|lastcpupid=0xffff)
> [   96.375653] page_type: 0xffffffff()
> [   96.376071] raw: 007fff8000000000 0000000000000000 fffffc0001f94090 ffff0000c99ee860
> [   96.377055] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [   96.378650] page dumped because: non-NULL mapping

OK, so page->mapping is ffff0000c99ee860 which does look plausible.
At least it's not a deferred_list (although it is a pfn suitable for
having a deferred_list ... for any allocation up to order-9)
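
To make the pfn remark concrete, here is a small userspace sketch of the
arithmetic. It assumes the deferred_list sits in the third struct page
(index 2) of a large folio and that folios are naturally aligned to their
order; the helper name and the index constant are illustrative and are not
taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: deferred_list lives in page index 2 of a large folio. */
#define DEFERRED_LIST_PAGE_IDX 2ULL

/*
 * Hypothetical helper: return true if @pfn would be page
 * DEFERRED_LIST_PAGE_IDX of a naturally aligned folio of @order.
 */
static bool pfn_could_hold_deferred_list(uint64_t pfn, unsigned int order)
{
	uint64_t nr_pages = 1ULL << order;

	/* A folio this small has no page at that index. */
	if (nr_pages <= DEFERRED_LIST_PAGE_IDX)
		return false;

	/* Offset of the pfn inside its order-aligned block. */
	return (pfn & (nr_pages - 1)) == DEFERRED_LIST_PAGE_IDX;
}
```

Under those assumptions, pfn 0xbe502 passes the check for orders 2 through
8. Its low nine bits are 0x102, so an aligned order-9 block would place it
at index 0x102 instead.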

> [   96.390688]  dump_stack_lvl+0x78/0xc8
> [   96.391163]  dump_stack+0x18/0x28
> [   96.391545]  bad_page+0x88/0x128
> [   96.391893]  get_page_from_freelist+0xa94/0x1bc0
> [   96.392407]  __alloc_pages+0x194/0x10b0

> [  113.131515] ------------[ cut here ]------------
> [  113.132190] UBSAN: array-index-out-of-bounds in mm/vmscan.c:1654:14
> [  113.132892] index 7 is out of range for type 'long unsigned int [5]'
> [  113.133617] CPU: 9 PID: 528 Comm: kswapd0 Tainted: G    B              6.8.0-rc5-ryarob01-swap-out-v4 #2
> [  113.134705] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [  113.135500] Call trace:
> [  113.135776]  dump_backtrace+0x9c/0x128
> [  113.136218]  show_stack+0x20/0x38
> [  113.136574]  dump_stack_lvl+0x78/0xc8
> [  113.136964]  dump_stack+0x18/0x28
> [  113.137322]  __ubsan_handle_out_of_bounds+0xa0/0xd8
> [  113.137885]  isolate_lru_folios+0x57c/0x658

I wish it weren't UBSAN reporting this, then we could get the folio
dumped.  I suppose we could put in an explicit check for folio_zonenum()
being > 5.  Does it usually happen in isolate_lru_folios()?
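
A rough userspace model of such an explicit check, assuming MAX_NR_ZONES
is 5 as the UBSAN report implies. The helper name is hypothetical. In the
kernel the failure branch would presumably dump the folio (something like
VM_BUG_ON_FOLIO()) and would not just return. Valid indices are 0 to 4,
so the sketch rejects anything >= MAX_NR_ZONES.

```c
#include <stdbool.h>

/* Assumption: matches the 'long unsigned int [5]' in the UBSAN report. */
#define MAX_NR_ZONES 5

/*
 * Hypothetical helper modelling the guarded nr_skipped[] update.
 * Returns false, leaving the array untouched, when @zone is out of range.
 */
static bool account_skipped(unsigned long nr_skipped[MAX_NR_ZONES],
			    unsigned int zone, unsigned long nr_pages)
{
	if (zone >= MAX_NR_ZONES)
		return false;	/* corrupted or bogus zone number */

	nr_skipped[zone] += nr_pages;
	return true;
}
```

With this in place, a zone number of 7 is caught before it is used as an
index, and the caller gets a chance to dump whatever it was looking at.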

> nr_skipped is a stack array of 5 elements. So I guess
> folio_zonenum(folio) is returning 7. That comes from the flags. I guess
> this is most likely just a side effect of the corrupted folio due to
> someone writing to it while it's on the free list?

Or it's a pointer to something that's not a folio?  Are we taking the
wrong lock somewhere again?