Michal Hocko wrote: > On Tue 20-03-18 22:30:16, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > > > > Michal Hocko wrote: > > > > > A single stack trace in the changelog would be sufficient IMHO. > > > > > Appart from that. What do you expect users will do about this trace? > > > > > Sure they will see a path which holds mmap_sem, we will see a bug report > > > > > but we can hardly do anything about that. We simply cannot drop the lock > > > > > from that path in 99% of situations. So _why_ do we want to add more > > > > > information to the log? > > > > > > > > This case is blocked at i_mmap_lock_write(). > > > > > > But why does i_mmap_lock_write matter for oom_reaping. We are not > > > touching hugetlb mappings. dup_mmap holds mmap_sem for write which is > > > the most probable source of the backoff. > > > > If i_mmap_lock_write can bail out upon SIGKILL, the OOM victim will be able to > > release mmap_sem held for write, which helps the OOM reaper not to back off. > > There are so many other blocking calls (including allocations) in > dup_mmap Yes. But > that I do not really think i_mmap_lock_write is the biggest > problem. That will be likely the case for other mmap_sem write lockers. i_mmap_lock_write() is one of the problems which we could afford fixing. 8 out of 11 "oom_reaper: unable to reap" messages are blocked at i_mmap_lock_write(). [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) [ 226.611971] a.out D13056 9261 6927 0x00100084 [ 226.615879] Call Trace: [ 226.617926] ? __schedule+0x25f/0x780 [ 226.620559] schedule+0x2d/0x80 [ 226.623356] rwsem_down_write_failed+0x2bb/0x440 [ 226.626426] ? rwsem_down_write_failed+0x55/0x440 [ 226.629458] ? anon_vma_fork+0x124/0x150 [ 226.632679] call_rwsem_down_write_failed+0x13/0x20 [ 226.635884] down_write+0x49/0x60 [ 226.638867] ? copy_process.part.41+0x12f2/0x1fe0 [ 226.642042] copy_process.part.41+0x12f2/0x1fe0 [ 226.645087] ? _do_fork+0xe6/0x560 [ 226.647991] _do_fork+0xe6/0x560 [ 226.650495] ? syscall_trace_enter+0x1a9/0x240 [ 226.653443] ? retint_user+0x18/0x18 [ 226.656601] ? page_fault+0x2f/0x50 [ 226.659159] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 226.662399] do_syscall_64+0x74/0x230 [ 226.664989] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 354.342331] oom_reaper: unable to reap pid:16611 (a.out) [ 354.346707] a.out D13312 16611 6927 0x00100084 [ 354.351173] Call Trace: [ 354.354164] ? __schedule+0x25f/0x780 [ 354.357633] schedule+0x2d/0x80 [ 354.360976] rwsem_down_write_failed+0x2bb/0x440 [ 354.364774] ? rwsem_down_write_failed+0x55/0x440 [ 354.368749] call_rwsem_down_write_failed+0x13/0x20 [ 354.372703] down_write+0x49/0x60 [ 354.376106] ? copy_process.part.41+0x12f2/0x1fe0 [ 354.379956] copy_process.part.41+0x12f2/0x1fe0 [ 354.383791] ? _do_fork+0xe6/0x560 [ 354.386967] _do_fork+0xe6/0x560 [ 354.390016] ? syscall_trace_enter+0x1a9/0x240 [ 354.393756] ? retint_user+0x18/0x18 [ 354.397296] ? page_fault+0x2f/0x50 [ 354.401081] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 354.404772] do_syscall_64+0x74/0x230 [ 354.407749] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 380.525910] oom_reaper: unable to reap pid:6927 (a.out) [ 380.529274] a.out R running task 13544 6927 1 0x00100084 [ 380.533264] Call Trace: [ 380.535433] ? __schedule+0x25f/0x780 [ 380.538052] ? mark_held_locks+0x60/0x80 [ 380.540775] schedule+0x2d/0x80 [ 380.543258] schedule_timeout+0x1a4/0x350 [ 380.546053] ? __next_timer_interrupt+0xd0/0xd0 [ 380.549033] msleep+0x25/0x30 [ 380.551458] shrink_inactive_list+0x5b7/0x690 [ 380.554408] ? __lock_acquire+0x1f1/0xfd0 [ 380.557228] ? find_held_lock+0x2d/0x90 [ 380.559959] shrink_node_memcg+0x340/0x770 [ 380.562776] ? __lock_acquire+0x246/0xfd0 [ 380.565532] ? mem_cgroup_iter+0x121/0x4f0 [ 380.568337] ? mem_cgroup_iter+0x121/0x4f0 [ 380.571203] shrink_node+0xd8/0x370 [ 380.573745] do_try_to_free_pages+0xe3/0x390 [ 380.576759] try_to_free_pages+0xc8/0x110 [ 380.579476] __alloc_pages_slowpath+0x28a/0x8d9 [ 380.582639] __alloc_pages_nodemask+0x21d/0x260 [ 380.585662] new_slab+0x558/0x760 [ 380.588188] ___slab_alloc+0x353/0x6f0 [ 380.590962] ? copy_process.part.41+0x121f/0x1fe0 [ 380.594008] ? find_held_lock+0x2d/0x90 [ 380.596750] ? copy_process.part.41+0x121f/0x1fe0 [ 380.599852] __slab_alloc+0x41/0x7a [ 380.602469] ? copy_process.part.41+0x121f/0x1fe0 [ 380.605607] kmem_cache_alloc+0x1a6/0x1f0 [ 380.608528] copy_process.part.41+0x121f/0x1fe0 [ 380.611613] ? _do_fork+0xe6/0x560 [ 380.614299] _do_fork+0xe6/0x560 [ 380.616752] ? syscall_trace_enter+0x1a9/0x240 [ 380.619813] ? retint_user+0x18/0x18 [ 380.622588] ? page_fault+0x2f/0x50 [ 380.625184] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 380.628841] do_syscall_64+0x74/0x230 [ 380.631660] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 408.397539] oom_reaper: unable to reap pid:6927 (a.out) [ 408.400862] a.out D13544 6927 1 0x00100084 [ 408.404262] Call Trace: [ 408.406449] ? __schedule+0x25f/0x780 [ 408.409063] schedule+0x2d/0x80 [ 408.411440] rwsem_down_write_failed+0x2bb/0x440 [ 408.414544] ? rwsem_down_write_failed+0x55/0x440 [ 408.417684] call_rwsem_down_write_failed+0x13/0x20 [ 408.420866] down_write+0x49/0x60 [ 408.423442] ? copy_process.part.41+0x12f2/0x1fe0 [ 408.426521] copy_process.part.41+0x12f2/0x1fe0 [ 408.429587] ? _do_fork+0xe6/0x560 [ 408.432134] _do_fork+0xe6/0x560 [ 408.434601] ? syscall_trace_enter+0x1a9/0x240 [ 408.437729] ? retint_user+0x18/0x18 [ 408.440391] ? page_fault+0x2f/0x50 [ 408.442955] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 408.446850] do_syscall_64+0x74/0x230 [ 408.449523] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 435.702005] oom_reaper: unable to reap pid:7554 (a.out) [ 435.706759] a.out D13328 7554 6928 0x00100084 [ 435.711417] Call Trace: [ 435.714814] ? __schedule+0x25f/0x780 [ 435.718813] schedule+0x2d/0x80 [ 435.722308] rwsem_down_write_failed+0x2bb/0x440 [ 435.726661] ? rwsem_down_write_failed+0x55/0x440 [ 435.730881] ? anon_vma_fork+0x124/0x150 [ 435.734935] call_rwsem_down_write_failed+0x13/0x20 [ 435.739278] down_write+0x49/0x60 [ 435.743009] ? copy_process.part.41+0x12f2/0x1fe0 [ 435.747173] copy_process.part.41+0x12f2/0x1fe0 [ 435.751304] ? _do_fork+0xe6/0x560 [ 435.754924] _do_fork+0xe6/0x560 [ 435.758538] ? syscall_trace_enter+0x1a9/0x240 [ 435.762533] ? retint_user+0x18/0x18 [ 435.766115] ? page_fault+0x2f/0x50 [ 435.769646] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 435.773850] do_syscall_64+0x74/0x230 [ 435.777364] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 466.269660] oom_reaper: unable to reap pid:7560 (a.out) [ 466.273267] a.out D13120 7560 6928 0x00100084 [ 466.276878] Call Trace: [ 466.279146] ? __schedule+0x25f/0x780 [ 466.281945] schedule+0x2d/0x80 [ 466.284421] rwsem_down_write_failed+0x2bb/0x440 [ 466.287748] ? rwsem_down_write_failed+0x55/0x440 [ 466.291012] ? anon_vma_fork+0x124/0x150 [ 466.293990] call_rwsem_down_write_failed+0x13/0x20 [ 466.297344] down_write+0x49/0x60 [ 466.299925] ? copy_process.part.41+0x12f2/0x1fe0 [ 466.303310] copy_process.part.41+0x12f2/0x1fe0 [ 466.306504] ? _do_fork+0xe6/0x560 [ 466.309257] _do_fork+0xe6/0x560 [ 466.311911] ? syscall_trace_enter+0x1a9/0x240 [ 466.315112] ? retint_user+0x18/0x18 [ 466.317937] ? page_fault+0x2f/0x50 [ 466.320667] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 466.324043] do_syscall_64+0x74/0x230 [ 466.326999] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 495.621196] oom_reaper: unable to reap pid:7563 (a.out) [ 495.624785] a.out D12992 7563 6928 0x00100084 [ 495.628398] Call Trace: [ 495.630634] ? __schedule+0x25f/0x780 [ 495.633464] schedule+0x2d/0x80 [ 495.635952] rwsem_down_write_failed+0x2bb/0x440 [ 495.639169] ? rwsem_down_write_failed+0x55/0x440 [ 495.642432] ? anon_vma_fork+0x124/0x150 [ 495.645216] call_rwsem_down_write_failed+0x13/0x20 [ 495.648651] down_write+0x49/0x60 [ 495.651446] ? copy_process.part.41+0x12f2/0x1fe0 [ 495.654843] copy_process.part.41+0x12f2/0x1fe0 [ 495.658031] ? _do_fork+0xe6/0x560 [ 495.660718] _do_fork+0xe6/0x560 [ 495.663415] ? syscall_trace_enter+0x1a9/0x240 [ 495.666662] ? retint_user+0x18/0x18 [ 495.669369] ? page_fault+0x2f/0x50 [ 495.672162] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 495.675529] do_syscall_64+0x74/0x230 [ 495.678418] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 536.271631] oom_reaper: unable to reap pid:6928 (a.out) [ 536.275197] a.out D12992 6928 1 0x00100084 [ 536.278972] Call Trace: [ 536.281110] ? __schedule+0x25f/0x780 [ 536.283883] schedule+0x2d/0x80 [ 536.286593] rwsem_down_write_failed+0x2bb/0x440 [ 536.289821] ? rwsem_down_write_failed+0x55/0x440 [ 536.293113] ? anon_vma_fork+0x124/0x150 [ 536.296262] call_rwsem_down_write_failed+0x13/0x20 [ 536.299554] down_write+0x49/0x60 [ 536.302514] ? copy_process.part.41+0x12f2/0x1fe0 [ 536.305816] copy_process.part.41+0x12f2/0x1fe0 [ 536.309032] ? _do_fork+0xe6/0x560 [ 536.312206] _do_fork+0xe6/0x560 [ 536.314810] ? syscall_trace_enter+0x1a9/0x240 [ 536.318335] ? retint_user+0x18/0x18 [ 536.321064] ? page_fault+0x2f/0x50 [ 536.323757] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 536.327396] do_syscall_64+0x74/0x230 [ 536.330175] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 600.285293] oom_reaper: unable to reap pid:7550 (a.out) [ 600.289112] a.out D13104 7550 6931 0x00100084 [ 600.292918] Call Trace: [ 600.295506] ? __schedule+0x25f/0x780 [ 600.298614] ? __lock_acquire+0x246/0xfd0 [ 600.301879] schedule+0x2d/0x80 [ 600.304623] schedule_timeout+0x1fd/0x350 [ 600.307633] ? find_held_lock+0x2d/0x90 [ 600.310909] ? mark_held_locks+0x60/0x80 [ 600.314001] ? _raw_spin_unlock_irq+0x24/0x30 [ 600.317382] wait_for_completion+0xab/0x130 [ 600.320579] ? wake_up_q+0x70/0x70 [ 600.323419] flush_work+0x1bd/0x260 [ 600.326434] ? flush_work+0x174/0x260 [ 600.329332] ? destroy_worker+0x90/0x90 [ 600.332301] drain_all_pages+0x16d/0x1e0 [ 600.335464] __alloc_pages_slowpath+0x443/0x8d9 [ 600.338743] __alloc_pages_nodemask+0x21d/0x260 [ 600.342178] new_slab+0x558/0x760 [ 600.344861] ___slab_alloc+0x353/0x6f0 [ 600.347744] ? copy_process.part.41+0x121f/0x1fe0 [ 600.351213] ? find_held_lock+0x2d/0x90 [ 600.354119] ? copy_process.part.41+0x121f/0x1fe0 [ 600.357491] __slab_alloc+0x41/0x7a [ 600.360206] ? copy_process.part.41+0x121f/0x1fe0 [ 600.363480] kmem_cache_alloc+0x1a6/0x1f0 [ 600.366579] copy_process.part.41+0x121f/0x1fe0 [ 600.369761] ? _do_fork+0xe6/0x560 [ 600.372382] _do_fork+0xe6/0x560 [ 600.375142] ? syscall_trace_enter+0x1a9/0x240 [ 600.378258] ? retint_user+0x18/0x18 [ 600.380978] ? page_fault+0x2f/0x50 [ 600.383814] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 600.387096] do_syscall_64+0x74/0x230 [ 600.389928] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) [ 664.892292] a.out D13272 7558 6931 0x00100084 [ 664.895765] Call Trace: [ 664.897574] ? __schedule+0x25f/0x780 [ 664.900099] schedule+0x2d/0x80 [ 664.902260] rwsem_down_write_failed+0x2bb/0x440 [ 664.905249] ? rwsem_down_write_failed+0x55/0x440 [ 664.908335] ? free_pgd_range+0x569/0x5e0 [ 664.911145] call_rwsem_down_write_failed+0x13/0x20 [ 664.914121] down_write+0x49/0x60 [ 664.916519] ? unlink_file_vma+0x28/0x50 [ 664.919255] unlink_file_vma+0x28/0x50 [ 664.922234] free_pgtables+0x36/0x100 [ 664.924797] exit_mmap+0xbb/0x180 [ 664.927220] mmput+0x50/0x110 [ 664.929504] copy_process.part.41+0xb61/0x1fe0 [ 664.932448] ? _do_fork+0xe6/0x560 [ 664.934902] ? _do_fork+0xe6/0x560 [ 664.937361] _do_fork+0xe6/0x560 [ 664.939742] ? syscall_trace_enter+0x1a9/0x240 [ 664.942693] ? retint_user+0x18/0x18 [ 664.945309] ? page_fault+0x2f/0x50 [ 664.947896] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 664.951075] do_syscall_64+0x74/0x230 [ 664.953747] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 744.980868] oom_reaper: unable to reap pid:6928 (a.out) [ 744.984653] a.out D12992 6928 1 0x00100084 [ 744.988278] Call Trace: [ 744.990524] ? __schedule+0x25f/0x780 [ 744.993435] schedule+0x2d/0x80 [ 744.995877] rwsem_down_write_failed+0x2bb/0x440 [ 744.999272] ? rwsem_down_write_failed+0x55/0x440 [ 745.002759] ? anon_vma_fork+0x124/0x150 [ 745.005789] call_rwsem_down_write_failed+0x13/0x20 [ 745.009324] down_write+0x49/0x60 [ 745.012134] ? copy_process.part.41+0x12f2/0x1fe0 [ 745.015574] copy_process.part.41+0x12f2/0x1fe0 [ 745.018845] ? _do_fork+0xe6/0x560 [ 745.021698] _do_fork+0xe6/0x560 [ 745.024275] ? syscall_trace_enter+0x1a9/0x240 [ 745.027443] ? retint_user+0x18/0x18 [ 745.030281] ? page_fault+0x2f/0x50 [ 745.033138] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 745.036703] do_syscall_64+0x74/0x230 [ 745.039476] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > Really I am not sure dumping more information is beneficial here. Converting to use killable where we can afford is beneficial.