Re: BUG() due to "mm: put_and_wait_on_page_locked() while page is migrated"

On Tue 22-01-19 23:29:04, Qian Cai wrote:
> Running LTP migrate_pages03 [1] a few times triggering BUG() below on an arm64
> ThunderX2 server. Reverted the commit 9a1ea439b16b9 ("mm:
> put_and_wait_on_page_locked() while page is migrated") allows it to run
> continuously.
> 
> put_and_wait_on_page_locked
>   wait_on_page_bit_common
>     put_page
>       put_page_testzero
>         VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
> 
> [1]
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/migrate_pages/migrate_pages03.c
> 
> [ 1304.643587] page:ffff7fe0226ff000 count:2 mapcount:0 mapping:ffff8095c3406d58 index:0x7
> [ 1304.652082] xfs_address_space_operations [xfs]
[...]
> [ 1304.682652] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

This looks like a page reference count imbalance to me. The page seems
to have been freed at the time the migration code (wait_on_page_bit_common)
called put_page, and was then immediately reused for an xfs allocation, which
is why we see its ref count == 2. But I fail to see how that is possible, as
__migration_entry_wait already does get_page_unless_zero, so the
imbalance must have been preexisting.
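
To make the reasoning concrete, here is a toy user-space model of the two
refcount helpers involved. It is only a sketch, not kernel code: struct
toy_page and the toy_* functions are made up for illustration, and the
assert() merely stands in for VM_BUG_ON_PAGE. The point is that a reference
taken via get_page_unless_zero and dropped once by put_page cannot hit a
zero count on its own; that needs an extra put somewhere else.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct toy_page {
    atomic_int refcount;
};

/* Models get_page_unless_zero(): take a reference only if the count is
 * currently non-zero, i.e. the page has not already been freed. */
bool toy_get_page_unless_zero(struct toy_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Models put_page_testzero(): the count must be non-zero on entry (the
 * assert stands in for VM_BUG_ON_PAGE); returns true when this put
 * dropped the last reference. */
bool toy_put_page_testzero(struct toy_page *p)
{
    assert(atomic_load(&p->refcount) != 0);
    return atomic_fetch_sub(&p->refcount, 1) == 1;
}
```

With a balanced get/put pair, the count never reaches zero inside the
waiter's put. If some other path had dropped one reference too many
beforehand, the waiter's put would be the one to trip the check.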
-- 
Michal Hocko
SUSE Labs



