I noticed that when a kdump kernel triggers the OOM killer because too small a value was given to the crashkernel= parameter, the OOM reaper tends to fail to reclaim memory from OOM victims because they are in dup_mm() from copy_mm() from copy_process() with mmap_sem held for write. A debug dump reported that the OOM victim was merely sleeping at might_sleep_if() in prepare_alloc_pages() from __alloc_pages_nodemask() even though the OOM victim was ready to bail out.

Since copy_page_range() can be called with mmap_sem held for write, it is not a good idea to continue the loop after being killed by the OOM killer.

[ 9.965654] systemd-udevd invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
[ 9.968941] CPU: 0 PID: 132 Comm: systemd-udevd Not tainted 5.0.0-rc8+ #838
[ 9.970801] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 9.973897] Call Trace:
[ 9.974735] dump_stack+0x86/0xca
[ 9.975693] dump_header+0x10a/0x9d0
[ 9.976746] ? ___ratelimit+0x1d1/0x3c5
[ 9.977838] oom_kill_process.cold.31+0xb/0x59f
[ 9.979078] ? check_flags.part.40+0x420/0x420
[ 9.980727] out_of_memory+0x287/0x800
[ 9.981907] ? oom_killer_disable+0x200/0x200
[ 9.983067] ? mutex_trylock+0x191/0x1e0
[ 9.984183] ? __alloc_pages_slowpath+0xa16/0x2380
[ 9.985485] __alloc_pages_slowpath+0x1cb2/0x2380
[ 9.986767] ? __zone_watermark_ok+0x213/0x370
[ 9.988014] ? warn_alloc+0x120/0x120
[ 9.989089] ? sched_clock_cpu+0x1b/0x170
[ 9.990343] ? __might_sleep+0x95/0x190
[ 9.991569] __alloc_pages_nodemask+0x515/0x610
[ 9.992843] ? __kasan_kmalloc.constprop.8+0xc5/0xd0
[ 9.994215] ? kasan_slab_alloc+0x11/0x20
[ 9.995323] ? __alloc_pages_slowpath+0x2380/0x2380
[ 9.996649] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 9.998168] ? _raw_spin_unlock+0x22/0x30
[ 9.999240] __get_free_pages+0x14/0x90
[ 10.000303] get_zeroed_page+0x11/0x20
[ 10.001391] __pud_alloc+0x2e/0x120
[ 10.002443] copy_page_range+0xf78/0x1af0
[ 10.003544] ? sched_clock_cpu+0x1b/0x170
[ 10.004658] ? sched_clock+0x9/0x10
[ 10.005646] ? find_held_lock+0x40/0x1e0
[ 10.006909] ? check_flags.part.40+0x420/0x420
[ 10.008450] ? vma_gap_callbacks_rotate+0x5a/0x90
[ 10.009766] ? __pmd_alloc+0x370/0x370
[ 10.010838] ? __vma_link_rb+0x1fc/0x340
[ 10.011963] copy_process.part.56+0x2f0e/0x6c80
[ 10.013184] ? __cleanup_sighand+0x40/0x40
[ 10.014331] ? sched_clock_cpu+0x1b/0x170
[ 10.015398] ? find_held_lock+0x40/0x1e0
[ 10.016489] ? check_flags.part.40+0x420/0x420
[ 10.017747] _do_fork+0x15d/0xb90
[ 10.018677] ? __fd_install+0x16c/0x470
[ 10.019760] ? fork_idle+0x250/0x250
[ 10.020777] ? fd_install+0x47/0x60
[ 10.021766] ? do_pipe2+0x102/0x140
[ 10.022793] ? pci_mmcfg_check_reserved+0x120/0x120
[ 10.024377] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 10.025813] ? do_syscall_64+0x18/0x3e0
[ 10.027035] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 10.029358] __x64_sys_clone+0xba/0x140
[ 10.030779] do_syscall_64+0x8f/0x3e0
[ 10.031848] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 10.033300] RIP: 0033:0x7f674d010f42
[ 10.034318] Code: f7 d8 64 89 04 25 d4 02 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 5d 01 00 00 85 c0 41 89 c5 0f 85 67 01 00
[ 10.039645] RSP: 002b:00007ffcf9331600 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 10.041806] RAX: ffffffffffffffda RBX: 00007ffcf9331600 RCX: 00007f674d010f42
[ 10.043812] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 10.045766] RBP: 00007ffcf9331640 R08: 00007f674e3ef8c0 R09: 0000000000000084
[ 10.047728] R10: 00007f674e3efb90 R11: 0000000000000246 R12: 0000000000000000
[ 10.049685] R13: 0000000000000000 R14: 00007ffcf9333d20 R15: 00007ffcf9333920
[ 10.051705] Mem-Info:
[ 10.052349] active_anon:3104 inactive_anon:7316 isolated_anon:0
[ 10.052349] active_file:0 inactive_file:0 isolated_file:0
[ 10.052349] unevictable:0 dirty:0 writeback:0 unstable:0
[ 10.052349] slab_reclaimable:5033 slab_unreclaimable:13704
[ 10.052349] mapped:1177 shmem:9911 pagetables:148 bounce:0
[ 10.052349] free:479 free_pcp:41 free_cma:0
[ 10.060924] Node 0 active_anon:12416kB inactive_anon:29264kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:4708kB dirty:0kB writeback:0kB shmem:39644kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[ 10.069022] DMA free:508kB min:2052kB low:2052kB high:2052kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:600kB managed:516kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 10.076308] lowmem_reserve[]: 0 123 123 123
[ 10.077655] DMA32 free:1408kB min:1416kB low:1768kB high:2120kB active_anon:12416kB inactive_anon:29252kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:261524kB managed:126532kB mlocked:0kB kernel_stack:2656kB pagetables:592kB bounce:0kB free_pcp:164kB local_pcp:164kB free_cma:0kB
[ 10.085864] lowmem_reserve[]: 0 0 0 0
[ 10.087035] DMA: 0*4kB 1*8kB (M) 1*16kB (M) 1*32kB (M) 1*64kB (U) 1*128kB (U) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 504kB
[ 10.090710] DMA32: 14*4kB (UME) 9*8kB (UME) 14*16kB (UME) 7*32kB (UM) 3*64kB (UME) 1*128kB (M) 2*256kB (ME) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1408kB
[ 10.094751] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 10.097676] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 10.100979] 9911 total pagecache pages
[ 10.102191] 0 pages in swap cache
[ 10.103238] Swap cache stats: add 0, delete 0, find 0/0
[ 10.104849] Free swap = 0kB
[ 10.105839] Total swap = 0kB
[ 10.106846] 65531 pages RAM
[ 10.107724] 0 pages HighMem/MovableOnly
[ 10.108903] 33769 pages reserved
[ 10.109902] 0 pages cma reserved
[ 10.110915] Unreclaimable slab info:
[ 10.112080] Name Used Total
[ 10.113686] fib6_nodes 0KB 4KB
[ 10.115512] RAWv6 10KB 16KB
[ 10.117043] sgpool-128 8KB 31KB
[ 10.118553] sgpool-64 4KB 31KB
[ 10.120101] sgpool-32 2KB 15KB
[ 10.121577] sgpool-16 1KB 7KB
[ 10.123167] sgpool-8 1KB 7KB
[ 10.124725] mqueue_inode_cache 1KB 15KB
[ 10.126273] bio-1 2KB 7KB
[ 10.127752] UNIX 67KB 90KB
[ 10.129229] ip_fib_trie 1KB 3KB
[ 10.130686] ip_fib_alias 1KB 3KB
[ 10.132116] RAW 3KB 30KB
[ 10.133631] UDP 2KB 30KB
[ 10.135063] hugetlbfs_inode_cache 2KB 31KB
[ 10.136657] eventpoll_pwq 14KB 23KB
[ 10.138054] eventpoll_epi 20KB 31KB
[ 10.139521] inotify_inode_mark 2KB 3KB
[ 10.141039] request_queue 3KB 31KB
[ 10.142472] bio-0 2KB 7KB
[ 10.143905] biovec-max 84KB 101KB
[ 10.145381] bio_integrity_payload 1KB 7KB
[ 10.146988] dmaengine-unmap-2 0KB 4KB
[ 10.148415] audit_buffer 0KB 7KB
[ 10.149869] skbuff_head_cache 244KB 311KB
[ 10.151264] configfs_dir_cache 1KB 3KB
[ 10.152759] fsnotify_mark_connector 2KB 3KB
[ 10.154326] task_delay_info 43KB 47KB
[ 10.155821] proc_dir_entry 385KB 393KB
[ 10.157388] pde_opener 1KB 7KB
[ 10.158846] seq_file 13KB 38KB
[ 10.160273] sigqueue 0KB 7KB
[ 10.161766] shmem_inode_cache 1086KB 1099KB
[ 10.163256] kernfs_node_cache 23189KB 23193KB
[ 10.164688] mnt_cache 30KB 31KB
[ 10.166166] filp 281KB 285KB
[ 10.167596] names_cache 980KB 994KB
[ 10.169095] key_jar 3KB 7KB
[ 10.170528] nsproxy 0KB 3KB
[ 10.171954] vm_area_struct 483KB 489KB
[ 10.173540] mm_struct 30KB 48KB
[ 10.175039] fs_cache 6KB 15KB
[ 10.176538] files_cache 13KB 30KB
[ 10.177977] signal_cache 157KB 184KB
[ 10.179469] sighand_cache 217KB 252KB
[ 10.180919] task_struct 592KB 626KB
[ 10.182349] cred_jar 63KB 78KB
[ 10.183772] anon_vma_chain 364KB 368KB
[ 10.185231] anon_vma 121KB 137KB
[ 10.186724] pid 45KB 48KB
[ 10.188546] Acpi-Operand 3938KB 4232KB
[ 10.190127] Acpi-ParseExt 0KB 15KB
[ 10.191627] Acpi-Parse 0KB 15KB
[ 10.193048] Acpi-State 0KB 15KB
[ 10.194670] Acpi-Namespace 3112KB 3127KB
[ 10.196245] trace_event_file 241KB 243KB
[ 10.197717] ftrace_event_field 553KB 554KB
[ 10.199211] pool_workqueue 18KB 30KB
[ 10.200701] task_group 6KB 15KB
[ 10.202331] debug_objects_cache 1675KB 1676KB
[ 10.203790] page->ptl 121KB 125KB
[ 10.205269] kmalloc-8k 116KB 125KB
[ 10.206769] kmalloc-4k 660KB 1033KB
[ 10.208563] kmalloc-2k 3480KB 3503KB
[ 10.210005] kmalloc-1k 506KB 525KB
[ 10.211509] kmalloc-512 413KB 493KB
[ 10.212940] kmalloc-256 1042KB 1049KB
[ 10.214364] kmalloc-192 96KB 103KB
[ 10.215800] kmalloc-128 503KB 506KB
[ 10.217203] kmalloc-96 257KB 496KB
[ 10.218730] kmalloc-64 962KB 995KB
[ 10.220155] kmalloc-32 1755KB 1770KB
[ 10.221622] kmalloc-16 1597KB 1604KB
[ 10.223047] kmalloc-8 1370KB 1392KB
[ 10.224774] kmem_cache_node 91KB 94KB
[ 10.226223] kmem_cache 142KB 149KB
[ 10.227648] Tasks state (memory values in pages):
[ 10.228981] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 10.231399] [ 128] 0 128 8930 954 114688 0 -1000 systemd-udevd
[ 10.233952] [ 130] 0 130 8765 523 110592 0 0 systemd-udevd
[ 10.236312] [ 132] 0 132 8765 524 110592 0 0 systemd-udevd
[ 10.238702] [ 180] 0 180 1162 75 45056 0 0 systemd-detect-
[ 10.241295] [ 181] 0 181 7725 0 110592 0 0 systemd-journal
[ 10.243763] [ 185] 0 185 2400 0 81920 0 0 dracut-initqueu
[ 10.246177] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),global_oom,task_memcg=/,task=systemd-udevd,pid=132,uid=0
[ 10.249123] Out of memory: Kill process 132 (systemd-udevd) score 17 or sacrifice child
[ 10.251446] Killed process 132 (systemd-udevd) total-vm:35060kB, anon-rss:400kB, file-rss:4kB, shmem-rss:1692kB
[ 11.295270] oom_reaper: unable to reap pid:132 (systemd-udevd)
[ 11.296965] task PC stack pid father
(...snipped...)
[ 12.965253] systemd-udevd R running task 27168 132 128 0x80100004
[ 12.967313] Call Trace:
[ 12.968074] __schedule+0x6c0/0x1a00
[ 12.969115] ? __lock_is_held+0xbc/0x140
[ 12.970270] ? pci_mmcfg_check_reserved+0x120/0x120
[ 12.971690] preempt_schedule_common+0x22/0x60
[ 12.973055] _cond_resched+0x1d/0x30
[ 12.974087] __alloc_pages_nodemask+0x3bd/0x610
[ 12.975386] ? __alloc_pages_slowpath+0x2380/0x2380
[ 12.976801] ? kasan_check_read+0x11/0x20
[ 12.978054] __pmd_alloc+0x36/0x370
[ 12.979037] ? __pud_alloc+0x83/0x120
[ 12.980073] copy_page_range+0x1024/0x1af0
[ 12.981183] ? sched_clock_cpu+0x1b/0x170
[ 12.982272] ? sched_clock+0x9/0x10
[ 12.983266] ? find_held_lock+0x40/0x1e0
[ 12.984474] ? check_flags.part.40+0x420/0x420
[ 12.985709] ? vma_gap_callbacks_rotate+0x5a/0x90
[ 12.987059] ? __pmd_alloc+0x370/0x370
[ 12.988112] ? __vma_link_rb+0x1fc/0x340
[ 12.989283] copy_process.part.56+0x2f0e/0x6c80
[ 12.990617] ? __cleanup_sighand+0x40/0x40
[ 12.991724] ? sched_clock_cpu+0x1b/0x170
[ 12.992839] ? find_held_lock+0x40/0x1e0
[ 12.993919] ? check_flags.part.40+0x420/0x420
[ 12.995193] _do_fork+0x15d/0xb90
[ 12.996176] ? __fd_install+0x16c/0x470
[ 12.997216] ? fork_idle+0x250/0x250
[ 12.998252] ? fd_install+0x47/0x60
[ 12.999399] ? do_pipe2+0x102/0x140
[ 13.000389] ? pci_mmcfg_check_reserved+0x120/0x120
[ 13.001735] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 13.003004] ? do_syscall_64+0x18/0x3e0
[ 13.004050] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 13.005490] __x64_sys_clone+0xba/0x140
[ 13.006786] do_syscall_64+0x8f/0x3e0
[ 13.007781] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 13.009242] RIP: 0033:0x7f674d010f42
[ 13.010293] Code: f7 d8 64 89 04 25 d4 02 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 5d 01 00 00 85 c0 41 89 c5 0f 85 67 01 00
[ 13.015179] RSP: 002b:00007ffcf9331600 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 13.017273] RAX: ffffffffffffffda RBX: 00007ffcf9331600 RCX: 00007f674d010f42
[ 13.019208] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 13.021216] RBP: 00007ffcf9331640 R08: 00007f674e3ef8c0 R09: 0000000000000084
[ 13.023186] R10: 00007f674e3efb90 R11: 0000000000000246 R12: 0000000000000000
[ 13.025061] R13: 0000000000000000 R14: 00007ffcf9333d20 R15: 00007ffcf9333920
(...snipped...)
[ 13.249697] Showing all locks held in the system:
[ 13.251378] 1 lock held by oom_reaper/18:
[ 13.252499] #0: 00000000c8a61e24 (rcu_read_lock){....}, at: debug_show_all_locks+0x5b/0x27e
[ 13.254906] 1 lock held by systemd-udevd/128:
[ 13.256071] #0: 00000000e09c1ed1 (&mm->mmap_sem){++++}, at: __do_page_fault+0x23a/0x900
[ 13.258336] 2 locks held by systemd-udevd/132:
[ 13.259559] #0: 00000000b4432d13 (&mm->mmap_sem){++++}, at: copy_process.part.56+0x23e5/0x6c80
[ 13.261868] #1: 0000000084913324 (&mm->mmap_sem/1){+.+.}, at: copy_process.part.56+0x2408/0x6c80
[ 13.264306] 1 lock held by systemd-detect-/180:
[ 13.265584] #0: 000000001cfadba8 (&mm->mmap_sem){++++}, at: __do_page_fault+0x23a/0x900
[ 13.267773] 2 locks held by systemd-journal/189:
[ 13.269005] #0: 000000003687636a (&p->lock){+.+.}, at: seq_read+0x66/0x1030
[ 13.270934] #1: 00000000a4d62cb5 (&mm->mmap_sem){++++}, at: __do_page_fault+0x23a/0x900
[ 13.273155] 2 locks held by systemctl/190:
[ 13.274390] #0: 000000000f41a6cc (&p->lock){+.+.}, at: seq_read+0x66/0x1030
[ 13.276349] #1: 00000000a1ed5f2f (&mm->mmap_sem){++++}, at: __do_page_fault+0x23a/0x900
[ 13.278527]
[ 13.278976] =============================================
[ 13.278976]
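
The concern in the description, that the copy loop keeps allocating page tables after the task has been OOM-killed while mmap_sem is held for write, can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's copy_page_range() and not necessarily the change this patch makes: struct task_model, copy_range_model() and the kill_after field are hypothetical names, the boolean flag stands in for a check such as fatal_signal_pending(current), and -EINTR is an arbitrary choice of error code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the forking task; not a kernel structure. */
struct task_model {
	bool fatal_signal_pending;	/* stands in for fatal_signal_pending(current) */
	size_t kill_after;		/* model only: SIGKILL "arrives" before this iteration */
};

/*
 * Hypothetical model of a page-table copy loop. Each iteration stands for
 * one page-table allocation made while the write lock is held. Checking
 * the flag before every allocation lets a killed task return early, so
 * the caller can drop the lock instead of looping on allocations that
 * cannot make progress.
 */
static int copy_range_model(struct task_model *t, size_t nr_tables,
			    size_t *copied)
{
	*copied = 0;
	for (size_t i = 0; i < nr_tables; i++) {
		/* Model only: simulate the OOM killer selecting this task. */
		if (i == t->kill_after)
			t->fatal_signal_pending = true;

		/* Bail out instead of allocating more (error code is an assumption). */
		if (t->fatal_signal_pending)
			return -EINTR;

		(*copied)++;	/* one page table "copied" */
	}
	return 0;
}
```

With kill_after set to 3 and ten tables requested, the model stops after three copies and reports the error; with kill_after beyond the table count, all ten are copied and 0 is returned.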