On Wed, Jul 20, 2016 at 5:33 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> On 07/20/2016 05:19 PM, Kirill A. Shutemov wrote:
>> On Wed, Jul 20, 2016 at 03:39:24PM +0200, Vlastimil Babka wrote:
>>> On 07/20/2016 01:53 PM, Michal Hocko wrote:
>>>> On Wed 20-07-16 14:44:17, Kirill A. Shutemov wrote:
>>>>> Hello,
>>>>>
>>>>> Looks like current mmotm is broken. See the trace below.
>>>>
>>>> Why do you think it is broken? This is an order-2 NOWAIT allocation,
>>>> so we are relying on the atomic high-order reserves and kcompactd to
>>>> make sufficient progress. It is hard to find out more without the
>>>> full log, including the meminfo.
>>>
>>> Also, it seems to come from KASAN allocating stackdepot space to
>>> record who freed a slab object, or something like that.
>>>
>>>>> It's easy to reproduce in my setup: a virtual machine with some
>>>>> swap space; try to allocate about the size of RAM in userspace
>>>>> (I used usemem[1] for that).
>>>>
>>>> Have you tried to bisect it?
>>
>> Bisected to a590d2628f08 ("mm, kasan: switch SLUB to stackdepot, enable
>> memory quarantine for SLUB").
>>
>> I guess it's a candidate for __GFP_NOWARN. Not sure if there's a better
>> solution.
>
> An order-0 fallback, maybe?
> Agree with NOWARN - if stackdepot (or its users) can tell that a trace
> is missing because an allocation failed, the precise allocation trace
> isn't that useful, I guess. An order-2 allocation that's potentially
> atomic and frequent can simply fail.
>
>> This helps:
>>
>> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
>> index 53ad6c0831ae..60f77f1d470a 100644
>> --- a/lib/stackdepot.c
>> +++ b/lib/stackdepot.c
>> @@ -242,6 +242,7 @@ depot_stack_handle_t depot_save_stack(struct stack_trace *trace,
>>  	 */
>>  	alloc_flags &= ~GFP_ZONEMASK;
>>  	alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
>> +	alloc_flags |= __GFP_NOWARN;
>>  	page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>  	if (page)
>>  		prealloc = page_address(page);
>>
>>>> Some of the recent compaction changes might have made a difference.
>>>
>>> AFAIK the recent compaction changes are not in mmotm yet. The
>>> node-based LRU reclaim might have shifted some balances, perhaps.
>>>
>>>>> Any clues?
>>>>>
>>>>> [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar
>>>>>
>>>>> [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT)
>>>>> [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878
>>>>> [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
>>>>> [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6
>>>>> [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8
>>>>> [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3
>>>>> [ 39.416018] Call Trace:
>>>>> [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8
>>>>> [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250
>>>>> [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250
>>>>> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0
>>>>> [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0
>>>>> [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0
>>>>> [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0
>>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30
>>>>> [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120
>>>>> [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90
>>>>> [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290
>>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20
>>>>> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90
>>>>> [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90
>>>>> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0
>>>>> [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100
>>>>> [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0
>>>>> [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0
>>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30
>>>>> [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180
>>>>> [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50
>>>>> [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180
>>>>> [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300
>>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30
>>>>> [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0
>>>>> [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220
>>>>> [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0
>>>>> [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310
>>>>> [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0
>>>>> [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30
>>>>> [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130
>>>>> [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0
>>>>> [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70
>>>>> [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0
>>>>> [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20
>>>>> [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90
>>>>> [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90
>>>>> [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00
>>>>> [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0
>>>>> [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660
>>>>> [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470
>>>>> [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940
>>>>> [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70
>>>>> [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00
>>>>> [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70
>>>>> [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160
>>>>> [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60
>>>>> [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460
>>>>> [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0
>>>>> [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40
>>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0
>>>>> [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300
>>>>> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0
>>>>> [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300
>>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30
>>>>> [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300
>>>>> [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0
>>>>> [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30
>>>>> [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0
>>>>> [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220
>>>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0
>>>>> [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0
>>>>> [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310
>>>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0
>>>>> [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0
>>>>> [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30
>>>>> [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0
>>>>> [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130
>>>>> [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0
>>>>> [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70
>>>>> [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0
>>>>> [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20
>>>>> [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90
>>>>> [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90
>>>>> [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340
>>>>> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00
>>>>> [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00
>>>>> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00
>>>>> [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0
>>>>> [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490
>>>>> [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0
>>>>> [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0
>>>>> [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940
>>>>> [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60
>>>>> [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0
>>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660
>>>>> [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160
>>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20
>>>>> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330
>>>>> [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130
>>>>> [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470
>>>>> [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60
>>>>> [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660
>>>>> [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0
>>>>> [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940
>>>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0
>>>>> [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490
>>>>> [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240
>>>>> [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0
>>>>> [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50
>>>>> [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0
>>>>> [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70
>>>>> [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00
>>>>> [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0
>>>>> [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20
>>>>> [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0
>>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90
>>>>> [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0
>>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20
>>>>> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330
>>>>> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590
>>>>> [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70
>>>>> [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0
>>>>> [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70
>>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120
>>>>> [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0
>>>>> [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0
>>>>> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0
>>>>> [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160
>>>>> [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890
>>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190
>>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20
>>>>> [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10
>>>>> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50
>>>>> [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60
>>>>> [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160
>>>>> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0
>>>>> [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460
>>>>> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600
>>>>> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0
>>>>> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50
>>>>> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590
>>>>> [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50
>>>>> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0
>>>>> [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150
>>>>> [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60
>>>>> [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240
>>>>> [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0
>>>>> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600
>>>>> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240
>>>>> [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0
>>>>> [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40
>>>>> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240
>>>>>
>>>>> --
>>>>>  Kirill A. Shutemov
>>>>
>>>
>>
>

Am I understanding correctly that you're seeing allocation failures
from the stack depot? How often do they happen? Are they reported
under heavy load, or just when you boot the kernel?

Allocating with __GFP_NOWARN will help here, but I think we'd better
figure out what has gone wrong. I've sent
https://lkml.org/lkml/2016/7/14/566 for review; it should reduce the
stack depot's memory consumption - can you check whether the bug is
still reproducible with that patch applied?

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
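
For reference, a minimal sketch (not the actual lib/stackdepot.c code)
of how the __GFP_NOWARN change from Kirill's diff could combine with
the order-0 fallback Vlastimil suggested. Note this is illustrative
only: the real depot_save_stack() carves its preallocated block into
STACK_ALLOC_ORDER-sized chunks, so a genuine fallback would also need
the depot's slab accounting to tolerate smaller allocations.

	alloc_flags &= ~GFP_ZONEMASK;
	alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
	/* Opportunistic high-order allocation: don't warn if it fails. */
	alloc_flags |= __GFP_NOWARN;
	page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
	if (!page)
		/* Hypothetical order-0 fallback: settle for a single page. */
		page = alloc_pages(alloc_flags, 0);
	if (page)
		prealloc = page_address(page);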