On 09/10/2014 08:47 AM, Mel Gorman wrote: > migrate: debug patch to try identify race between migration completion and mprotect > > A migration entry is marked as write if pte_write was true at the > time the entry was created. The VMA protections are not double checked > when migration entries are being removed but mprotect itself will mark > write-migration-entries as read to avoid problems. It means we potentially > take a spurious fault to mark these ptes write again but otherwise it's > harmless. Still, one dump indicates that this situation can actually > happen so this debugging patch spits out a warning if the situation occurs > and hopefully the resulting warning will contain a clue as to how exactly > it happens > > Not-signed-off > --- > mm/migrate.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 09d489c..631725c 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -146,8 +146,16 @@ static int remove_migration_pte(struct page *new, struct vm_area_struct *vma, > pte = pte_mkold(mk_pte(new, vma->vm_page_prot)); > if (pte_swp_soft_dirty(*ptep)) > pte = pte_mksoft_dirty(pte); > - if (is_write_migration_entry(entry)) > - pte = pte_mkwrite(pte); > + if (is_write_migration_entry(entry)) { > + /* > + * This WARN_ON_ONCE is temporary for the purposes of seeing if > + * it's a case encountered by trinity in Sasha's testing > + */ > + if (!(vma->vm_flags & (VM_WRITE))) > + WARN_ON_ONCE(1); > + else > + pte = pte_mkwrite(pte); > + } > #ifdef CONFIG_HUGETLB_PAGE > if (PageHuge(new)) { > pte = pte_mkhuge(pte); I seem to have hit this warning: [ 4782.617806] WARNING: CPU: 10 PID: 21180 at mm/migrate.c:155 remove_migration_pte+0x3f7/0x420() [ 4782.619315] Modules linked in: [ 4782.622189] [ 4782.622501] CPU: 10 PID: 21180 Comm: trinity-main Tainted: G W 3.17.0-rc4-next-20140910-sasha-00032-g6825fb5-dirty #1137 [ 4782.624344] 0000000000000009 ffff8800193eb770 ffffffffa04c742a 0000000000000000 [ 4782.627801] ffff8800193eb7a8 ffffffff9d16e55d 00007f2458d89000 ffff880120959600 [ 4782.629283] ffff88012b02c000 ffffea002abeab00 ffff88063118da90 ffff8800193eb7b8 [ 4782.631353] Call Trace: [ 4782.633789] [<ffffffffa04c742a>] dump_stack+0x4e/0x7a [ 4782.634314] [<ffffffff9d16e55d>] warn_slowpath_common+0x7d/0xa0 [ 4782.634877] [<ffffffff9d16e63a>] warn_slowpath_null+0x1a/0x20 [ 4782.635430] [<ffffffff9d315487>] remove_migration_pte+0x3f7/0x420 [ 4782.636042] [<ffffffff9d2e99cf>] rmap_walk+0xef/0x380 [ 4782.636544] [<ffffffff9d3147f1>] remove_migration_ptes+0x41/0x50 [ 4782.637130] [<ffffffff9d315090>] ? __migration_entry_wait.isra.24+0x160/0x160 [ 4782.639928] [<ffffffff9d3154b0>] ? remove_migration_pte+0x420/0x420 [ 4782.640616] [<ffffffff9d31671b>] move_to_new_page+0x16b/0x230 [ 4782.641251] [<ffffffff9d2e9e8c>] ? try_to_unmap+0x6c/0xf0 [ 4782.643950] [<ffffffff9d2e88a0>] ? try_to_unmap_nonlinear+0x5c0/0x5c0 [ 4782.644690] [<ffffffff9d2e70a0>] ? invalid_migration_vma+0x30/0x30 [ 4782.645273] [<ffffffff9d2e82e0>] ? page_remove_rmap+0x320/0x320 [ 4782.646072] [<ffffffff9d31717c>] migrate_pages+0x85c/0x930 [ 4782.646701] [<ffffffff9d2d0e20>] ? isolate_freepages_block+0x410/0x410 [ 4782.647407] [<ffffffff9d2cfa60>] ? arch_local_save_flags+0x30/0x30 [ 4782.648114] [<ffffffff9d2d1803>] compact_zone+0x4d3/0x8a0 [ 4782.650157] [<ffffffff9d2d1c2f>] compact_zone_order+0x5f/0xa0 [ 4782.651014] [<ffffffff9d2d1f87>] try_to_compact_pages+0x127/0x2f0 [ 4782.651656] [<ffffffff9d2b0c98>] __alloc_pages_direct_compact+0x68/0x200 [ 4782.652313] [<ffffffff9d2b17ca>] __alloc_pages_nodemask+0x99a/0xd90 [ 4782.652916] [<ffffffff9d300a1c>] alloc_pages_vma+0x13c/0x270 [ 4782.653618] [<ffffffff9d31d914>] ? do_huge_pmd_wp_page+0x494/0xc90 [ 4782.654487] [<ffffffff9d31d914>] do_huge_pmd_wp_page+0x494/0xc90 [ 4782.656045] [<ffffffff9d320d20>] ? __mem_cgroup_count_vm_event+0xd0/0x240 [ 4782.657089] [<ffffffff9d2dcb7d>] handle_mm_fault+0x8bd/0xc50 [ 4782.660931] [<ffffffff9d1d26e6>] ? __lock_is_held+0x56/0x80 [ 4782.662695] [<ffffffff9d0c7bc7>] __do_page_fault+0x1b7/0x660 [ 4782.663259] [<ffffffff9d1cdc5e>] ? put_lock_stats.isra.13+0xe/0x30 [ 4782.663851] [<ffffffff9d1abf41>] ? vtime_account_user+0x91/0xa0 [ 4782.664419] [<ffffffff9d2a2c35>] ? context_tracking_user_exit+0xb5/0x1b0 [ 4782.665119] [<ffffffff9db6e103>] ? __this_cpu_preempt_check+0x13/0x20 [ 4782.665969] [<ffffffff9d1ce2e2>] ? trace_hardirqs_off_caller+0xe2/0x1b0 [ 4782.666634] [<ffffffff9d0c8141>] trace_do_page_fault+0x51/0x2b0 [ 4782.667257] [<ffffffff9d0bee83>] do_async_page_fault+0x63/0xd0 [ 4782.667871] [<ffffffffa0511018>] async_page_fault+0x28/0x30 Although it wasn't followed by anything else, and I've seen the original issue getting triggered without this WARN showing up, so it seems like a different, unrelated issue? Thanks, Sasha -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>