On Thu, Aug 15, 2019 at 09:35:26PM +0200, Michal Hocko wrote:

> > The last detail is I'm still unclear what a GFP flags a blockable
> > invalidate_range_start() should use. Is GFP_KERNEL OK?
>
> I hope I will not make this muddy again ;)
> invalidate_range_start in the blockable mode can use/depend on any sleepable
> allocation allowed in the context it is called from.

'in the context it is called from' is the magic phrase, as
invalidate_range_start is called while holding several different mm
related locks. I know at least write mmap_sem and i_mmap_rwsem
(write?)

Can GFP_KERNEL be used while holding those locks?

This is the question of indirect dependency on reclaim via locks you
raised earlier.

> So in other words it is no different from any other function in the
> kernel that calls into allocator. As the API is missing gfp context
> then I hope it is not called from any restricted contexts (except
> from the oom which we have !blockable for).

Yes, the callers are exactly my concern.

> > Lockdep has
> > complained on that in past due to fs_reclaim - how do you know if it
> > is a false positive?
>
> I would have to see the specific lockdep splat.

See below. I found it when trying to understand why the registration
of the mmu notifier was so oddly coded.

The situation was:

  down_write(&mm->mmap_sem);
  mm_take_all_locks(mm);
  kmalloc(GFP_KERNEL);  <--- lockdep warning

I understood Daniel said he saw this directly on a recent kernel when
working with his lockdep patch?

Checking myself, on today's kernel I see a call chain:

  shrink_all_memory
    fs_reclaim_acquire(sc.gfp_mask);
    [..]
    do_try_to_free_pages
     shrink_zones
      shrink_node
       shrink_node_memcg
        shrink_list
         shrink_active_list
          page_referenced
           rmap_walk
            rmap_walk_file
             i_mmap_lock_read
              down_read(i_mmap_rwsem)

So it is possible that the down_read() above will block on
i_mmap_rwsem being held in the caller of invalidate_range_start which
is doing kmalloc(GFP_KERNEL).

Is this OK? The lockdep annotation says no..

Jason

commit 35cfa2b0b491c37e23527822bf365610dbb188e5
Author: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx>
Date:   Thu Oct 25 13:38:01 2012 -0700

    mm/mmu_notifier: allocate mmu_notifier in advance

    While allocating mmu_notifier with parameter GFP_KERNEL, swap would start
    to work in case of tight available memory. Eventually, that would lead to
    a deadlock while the swap deamon swaps anonymous pages. It was caused by
    commit e0f3c3f78da29b ("mm/mmu_notifier: init notifier if necessary").

      =================================
      [ INFO: inconsistent lock state ]
      3.7.0-rc1+ #518 Not tainted
      ---------------------------------
      inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      kswapd0/35 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&mapping->i_mmap_mutex){+.+.?.}, at: page_referenced+0x9c/0x2e0
      {RECLAIM_FS-ON-W} state was registered at:
        mark_held_locks+0x86/0x150
        lockdep_trace_alloc+0x67/0xc0
        kmem_cache_alloc_trace+0x33/0x230
        do_mmu_notifier_register+0x87/0x180
        mmu_notifier_register+0x13/0x20
        kvm_dev_ioctl+0x428/0x510
        do_vfs_ioctl+0x98/0x570
        sys_ioctl+0x91/0xb0
        system_call_fastpath+0x16/0x1b
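
To make the ordering problem concrete, here is a small userspace C sketch
of the two paths discussed above. It is not lockdep's real algorithm and
not kernel code: the order_graph structure and the toy_* and *_path
helpers are hypothetical names made up for this illustration, fs_reclaim
is modeled as an ordinary lock class, and read/write distinctions on
i_mmap_rwsem are ignored. It only records "B was taken while A was held"
edges and reports when a new acquisition would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes; the names mirror the locks in the thread. */
enum lock_class { MMAP_SEM, I_MMAP_RWSEM, FS_RECLAIM, NR_CLASSES };

struct order_graph {
	bool edge[NR_CLASSES][NR_CLASSES]; /* edge[a][b]: b taken while a held */
	bool held[NR_CLASSES];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < NR_CLASSES; n++)
		if (g->edge[from][n] && !seen[n] && reachable(g, n, to, seen))
			return true;
	return false;
}

/* Take class c; returns false if this acquisition inverts a known order. */
static bool toy_acquire(struct order_graph *g, enum lock_class c)
{
	bool ok = true;

	assert(c < NR_CLASSES);
	for (int h = 0; h < NR_CLASSES; h++) {
		bool seen[NR_CLASSES] = { false };

		if (!g->held[h] || h == (int)c)
			continue;
		/* Adding h -> c closes a cycle if h is reachable from c. */
		if (reachable(g, c, h, seen))
			ok = false;
		g->edge[h][c] = true;
	}
	g->held[c] = true;
	return ok;
}

static void toy_release(struct order_graph *g, enum lock_class c)
{
	g->held[c] = false;
}

/* Stand-in for kmalloc(GFP_KERNEL): may enter reclaim, so touch FS_RECLAIM. */
static bool toy_alloc_gfp_kernel(struct order_graph *g)
{
	bool ok = toy_acquire(g, FS_RECLAIM);

	toy_release(g, FS_RECLAIM);
	return ok;
}

/* Registration as in "The situation was": allocate with the locks held. */
static bool register_alloc_under_locks(struct order_graph *g)
{
	bool ok = toy_acquire(g, MMAP_SEM);

	ok = toy_acquire(g, I_MMAP_RWSEM) && ok; /* mm_take_all_locks() */
	ok = toy_alloc_gfp_kernel(g) && ok;
	toy_release(g, I_MMAP_RWSEM);
	toy_release(g, MMAP_SEM);
	return ok;
}

/* Registration with the allocation done before any lock is taken. */
static bool register_prealloc(struct order_graph *g)
{
	bool ok = toy_alloc_gfp_kernel(g);

	ok = toy_acquire(g, MMAP_SEM) && ok;
	ok = toy_acquire(g, I_MMAP_RWSEM) && ok;
	toy_release(g, I_MMAP_RWSEM);
	toy_release(g, MMAP_SEM);
	return ok;
}

/* Reclaim side: inside fs_reclaim, rmap_walk_file takes i_mmap_rwsem. */
static bool reclaim_path(struct order_graph *g)
{
	bool ok = toy_acquire(g, FS_RECLAIM);

	ok = toy_acquire(g, I_MMAP_RWSEM) && ok;
	toy_release(g, I_MMAP_RWSEM);
	toy_release(g, FS_RECLAIM);
	return ok;
}
```

Running register_alloc_under_locks() and then reclaim_path() on the same
graph makes reclaim_path() return false, because i_mmap_rwsem -> fs_reclaim
was already recorded and reclaim now asks for the opposite order. With
register_prealloc() instead, which follows the idea in the title of the
quoted commit, reclaim_path() sees no inversion in this model.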