Thanks for the suggestion.

With 6.1-rc3, I am able to run my test to completion 😊
I will now back out all my debug and try it again.

Thanks,
Badari

-----Original Message-----
From: Hugh Dickins <hughd@xxxxxxxxxx>
Sent: Monday, October 31, 2022 12:39 PM
To: Pulavarty, Badari <badari.pulavarty@xxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>; david@xxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; bfoster@xxxxxxxxxx; huangzhaoyang@xxxxxxxxx; ke.wang@xxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; zhaoyang.huang@xxxxxxxxxx; Shutemov, Kirill <kirill.shutemov@xxxxxxxxx>; Tang, Feng <feng.tang@xxxxxxxxx>; Huang, Ying <ying.huang@xxxxxxxxx>; Yin, Fengwei <fengwei.yin@xxxxxxxxx>; Hansen, Dave <dave.hansen@xxxxxxxxx>; Zanussi, Tom <tom.zanussi@xxxxxxxxx>
Subject: RE: [RFC PATCH] mm: move xa forward when run across zombie page

On Mon, 31 Oct 2022, Pulavarty, Badari wrote:

> Hi,
>
> Just want to give an update on the issue, hoping to get more thoughts/suggestions.
>
> I have been adding lot of debug to try to root cause the issue.
> When I enabled CONFIG_VM_DEBUG, I run into following assertion failure:
>
> [ 1810.282055] entry: 0 folio: ffe6dfc30e428040
> [ 1810.282059] page dumped because: VM_BUG_ON_PAGE(entry != folio)
> [ 1810.282062] BUG: ...
> [ 1810.282310] __delete_from_swap_cache.cold.20+0x33/0x35
> [ 1810.282321] delete_from_swap_cache+0x50/0xa0
> [ 1810.282330] folio_free_swap+0xab/0xe0
> [ 1810.282339] free_swap_cache+0x8a/0xa0
> [ 1810.282346] free_page_and_swap_cache+0x12/0xb0
> [ 1810.282356] split_huge_page_to_list+0xf13/0x10d0 <<<<<<<<<<<<<<<<<<
> [ 1810.282365] madvise_cold_or_pageout_pte_range+0x528/0x1390
> [ 1810.282374] walk_pgd_range+0x5fe/0xa10
> [ 1810.282383] __walk_page_range+0x184/0x190
> [ 1810.282391] walk_page_range+0x120/0x190
> [ 1810.282398] madvise_pageout+0x10b/0x2a0
> [ 1810.282406] ? set_track_prepare+0x48/0x70
> [ 1810.282415] madvise_vma_behavior+0x2f2/0xb10
> [ 1810.282422] ? find_vma_prev+0x72/0xc0
> [ 1810.282431] do_madvise+0x21b/0x440
> [ 1810.282439] damon_va_apply_scheme+0x76/0xa0
> [ 1810.282448] kdamond_fn+0xbe9/0xe10
> [ 1810.282456] ? damon_split_region_at+0x70/0x70
> [ 1810.282675] kthread+0xfc/0x130
> [ 1810.282837] ? kthread_complete_and_exit+0x20/0x20
>
> Since I am not using hugepages explicitly.. I recompiled the kernel with
>
> CONFIG_TRANSPARENT_HUGEPAGE=n
>
> And my problem went away (including the original issue).

For that one, please try with 6.1-rc3 (and CONFIG_TRANSPARENT_HUGEPAGE
back to y). Mel put a fix to that kind of thing into 6.1-rc2, then I
fixed its warning in 6.1-rc3 (git log -n2 mm/huge_memory.c tells more).

Hugh