Hi,

On (09/13/17 18:56), Laurent Dufour wrote:
> Hi Sergey,
>
> On 13/09/2017 13:53, Sergey Senozhatsky wrote:
> > Hi,
> >
> > On (09/08/17 20:06), Laurent Dufour wrote:
[..]
> > ok, so what I got on my box is:
> >
> > vm_munmap()  -> down_write_killable(&mm->mmap_sem)
> >  do_munmap()
> >   __split_vma()
> >    __vma_adjust()  -> write_seqcount_begin(&vma->vm_sequence)
> >                    -> write_seqcount_begin_nested(&next->vm_sequence, SINGLE_DEPTH_NESTING)
> >
> > so this gives 3 dependencies  ->mmap_sem  ->  ->vm_seq
> >                               ->vm_seq    ->  ->vm_seq/1
> >                               ->mmap_sem  ->  ->vm_seq/1
> >
> >
> > SyS_mremap()  -> down_write_killable(&current->mm->mmap_sem)
> >  move_vma()   -> write_seqcount_begin(&vma->vm_sequence)
> >               -> write_seqcount_begin_nested(&new_vma->vm_sequence, SINGLE_DEPTH_NESTING);
> >   move_page_tables()
> >    __pte_alloc()
> >     pte_alloc_one()
> >      __alloc_pages_nodemask()
> >       fs_reclaim_acquire()
> >
> >
> > I think here we have prepare_alloc_pages() call, that does
> >
> >         -> fs_reclaim_acquire(gfp_mask)
> >         -> fs_reclaim_release(gfp_mask)
> >
> > so that adds one more dependency  ->mmap_sem  ->  ->vm_seq    ->  fs_reclaim
> >                                   ->mmap_sem  ->  ->vm_seq/1  ->  fs_reclaim
> >
> >
> > now, under memory pressure we hit the slow path and perform direct
> > reclaim. direct reclaim is done under fs_reclaim lock, so we end up
> > with the following call chain
> >
> > __alloc_pages_nodemask()
> >  __alloc_pages_slowpath()
> >   __perform_reclaim()  ->  fs_reclaim_acquire(gfp_mask);
> >    try_to_free_pages()
> >     shrink_node()
> >      shrink_active_list()
> >       rmap_walk_file()  ->  i_mmap_lock_read(mapping);
> >
> >
> > and this breaks the existing dependency. since we now take the leaf lock
> > (fs_reclaim) first and then the root lock (->mmap_sem).
>
> Thanks for looking at this.
> I'm sorry, I must have missed something.

no prob :)

> My understanding is that there are 3 chains of locks:
>  1. from __vma_adjust() mmap_sem -> i_mmap_rwsem -> vm_seq
>  2. from move_vma() mmap_sem -> vm_seq -> fs_reclaim
>  3. from __alloc_pages_nodemask() fs_reclaim -> i_mmap_rwsem

yes, as far as lockdep warning suggests.

> So the solution would be to have in __vma_adjust()
>  mmap_sem -> vm_seq -> i_mmap_rwsem
>
> But this will raise the following dependency from unmap_mapping_range()
> unmap_mapping_range()  -> i_mmap_rwsem
>  unmap_mapping_range_tree()
>   unmap_mapping_range_vma()
>    zap_page_range_single()
>     unmap_single_vma()
>      unmap_page_range()  -> vm_seq
>
> And there is no way to get rid of it easily as in unmap_mapping_range()
> there is no VMA identified yet.
>
> That being said, I can't see any clear way to get the lock dependency
> cleaned up here.
> Furthermore, it is not clear to me how a deadlock could happen as vm_seq
> is a sequence lock, and there is no way to get blocked here.

as far as I understand, seq locks can deadlock, technically. not on
the write() side, but on the read() side:

read_seqcount_begin()
 raw_read_seqcount_begin()
  __read_seqcount_begin()

and __read_seqcount_begin() spins forever

   __read_seqcount_begin()
   {
   repeat:
           ret = READ_ONCE(s->sequence);
           if (unlikely(ret & 1)) {
                   cpu_relax();
                   goto repeat;
           }
           return ret;
   }

so if there are two CPUs, one doing write_seqcount() and the other one
doing read_seqcount() then what can happen is something like this

        CPU0                                    CPU1

                                                fs_reclaim_acquire()
        write_seqcount_begin()
        fs_reclaim_acquire()                    read_seqcount_begin()
        write_seqcount_end()

CPU0 can't write_seqcount_end() because of fs_reclaim_acquire() from
CPU1, CPU1 can't read_seqcount_begin() because CPU0 did write_seqcount_begin()
and now waits for fs_reclaim_acquire(). makes sense?

	-ss
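
P.S. a minimal user-space sketch of the read-side stall described above, in case it helps to poke at it outside the kernel. It assumes C11 atomics in place of READ_ONCE()/cpu_relax(), and all names (seqcount_sketch, sketch_*) are hypothetical, not kernel API. The one deliberate difference from __read_seqcount_begin() is that the reader gives up after max_spins instead of spinning forever, so the stall can be observed rather than hanging the program.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's seqcount_t. */
struct seqcount_sketch {
	atomic_uint sequence;
};

/* Writer side: an odd sequence value marks a write in progress. */
static void sketch_write_begin(struct seqcount_sketch *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/* Writer side: make the sequence even again; must pair with write_begin. */
static void sketch_write_end(struct seqcount_sketch *s)
{
	unsigned int prev = atomic_fetch_add(&s->sequence, 1);

	assert(prev & 1);
}

/*
 * Reader side, bounded. The kernel loop retries without limit while the
 * sequence is odd; this sketch stops after max_spins attempts and reports
 * failure, which stands in for "the reader is stuck behind the writer".
 * On success the observed (even) sequence is stored in *seq.
 */
static bool sketch_read_begin_bounded(struct seqcount_sketch *s,
				      unsigned int max_spins,
				      unsigned int *seq)
{
	for (unsigned int i = 0; i < max_spins; i++) {
		unsigned int ret = atomic_load(&s->sequence);

		if (!(ret & 1)) {
			*seq = ret;
			return true;
		}
	}
	return false;
}
```

If the writer calls sketch_write_begin() and then blocks on something the reader already holds (fs_reclaim in the scenario above), sketch_read_begin_bounded() keeps returning false no matter how large max_spins is, which is the unbounded spin in the real code.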