On Tue 22-01-19 23:29:04, Qian Cai wrote:
> Running LTP migrate_pages03 [1] a few times triggering BUG() below on an arm64
> ThunderX2 server. Reverted the commit 9a1ea439b16b9 ("mm:
> put_and_wait_on_page_locked() while page is migrated") allows it to run
> continuously.
>
> put_and_wait_on_page_locked
> wait_on_page_bit_common
> put_page
> put_page_testzero
> VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
>
> [1]
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/migrate_pages/migrate_pages03.c
>
> [ 1304.643587] page:ffff7fe0226ff000 count:2 mapcount:0 mapping:ffff8095c3406d58 index:0x7
> [ 1304.652082] xfs_address_space_operations [xfs]
[...]
> [ 1304.682652] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

This looks like a page reference count imbalance to me. The page seems
to have been freed at the time the migration code
(wait_on_page_bit_common) called put_page, and was immediately reused
for an xfs allocation, which is why we see its ref count == 2. But I
fail to see how that is possible, as __migration_entry_wait already does
get_page_unless_zero, so the imbalance must have been preexisting.
-- 
Michal Hocko
SUSE Labs