* John Hsu (許永翰) <John.Hsu@xxxxxxxxxxxx> [230807 05:55]:
> On Wed, 2023-07-19 at 14:51 -0400, Liam R. Howlett wrote:

...

> > > As far as I know, the following is the rb_tree flow in 5.15.186:
> > >
> > > ...
> > > mmap_write_lock_killable(mm)
> > > ...
> > > do_mmap()
> > > ...
> > > mmap_region()
> > > ...
> > > vm_area_alloc(mm)
> > > ...
> > > mmap_write_unlock(mm)
> > >
> > > vm_area_alloc is in the mmap_lock holding period.
> > > It seems that the flow would sleep here in the rb_tree flow.
> > > If I missed anything, please tell me, thanks!
> >
> > Before the mmap_write_unlock(mm) in the above sequence, the
> > i_mmap_lock_write(), anon_vma_lock_write(), and/or the
> > flush_dcache_mmap_lock() may be taken. Check __vma_adjust().
> >
> > The insertion into the tree needs to hold some subset of these locks.
> > The rb-tree insert did not allocate within these locks, but the maple
> > tree would need to allocate within these locks to insert into the
> > tree.
> > This is why the preallocation exists and why it is necessary.
> >
>
> Yap, preallocation is necessary. anon_vma_lock_write() and
> flush_dcache_mmap_lock() hold the lock and manipulate rb_tree. I think
> that there are no maple tree manipulations during the lock holding
> period. Is there any future work in this section?

__vma_adjust() does modify the maple tree during the lock holding
section through vma_mas_store() in 6.1. Prior to 6.1, there is no
maple tree.

...

> > There are also config options to debug the tree operations, but they
> > do not detect the redundant write issues. Perhaps I can look at
> > adding support for detecting redundant writes, but that will not be
> > backported to a stable kernel.
> >
>
> The sufficient test cases of maple tree ensure the functions work well.
> But the redundant operations (alloc node, free node, tree
> manipulations) of maple_tree are not easy to detect (e.g. the case
> reported this time and mas_preallocate() allocates redundant nodes with
> the worst case).
>
> A mechanism for detecting redundant writes may help developers
> find such problems more easily. Hope it can be established successfully!!

When I went to add this, I found I had already added it here [1]. This
operation was not caught by MA_STATE_PREALLOC because there are two
writes before a mas_destroy(), so there may be nodes left which avoid
the warning. I'll look at improving this situation.

Thanks,
Liam

[1] https://lore.kernel.org/linux-mm/20220722160546.1478722-2-Liam.Howlett@xxxxxxxxxx/
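
For readers following the preallocation argument above, here is a minimal
userspace sketch of the general pattern: reserve memory while sleeping is
still allowed, store under the lock without allocating, then release
whatever was not used. It is not kernel code and does not reproduce the
maple tree API. Every name in it (struct tree, struct state,
alloc_may_sleep(), state_preallocate(), state_store_locked(),
state_destroy(), insert_key()) is hypothetical, a sorted list stands in
for the tree, and a boolean flag stands in for a lock under which
allocation is not permitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are kernel APIs. */
struct node {
	long key;
	struct node *next;
};

struct tree {
	struct node *head;	/* sorted list standing in for the tree */
	bool locked;		/* models a lock under which we must not sleep */
};

struct state {
	struct node *prealloc;	/* node reserved before taking the lock */
};

/* Models a sleeping allocation: only legal while the lock is not held. */
struct node *alloc_may_sleep(const struct tree *t)
{
	assert(!t->locked);
	return calloc(1, sizeof(struct node));
}

/* Reserve the worst-case memory (one node in this model) before locking. */
int state_preallocate(struct tree *t, struct state *st)
{
	if (!st->prealloc)
		st->prealloc = alloc_may_sleep(t);
	return st->prealloc ? 0 : -1;
}

/* Insert under the lock using only the reserved node; never allocates. */
void state_store_locked(struct tree *t, struct state *st, long key)
{
	struct node **pos = &t->head;
	struct node *n = st->prealloc;

	assert(t->locked && n);
	st->prealloc = NULL;
	n->key = key;
	while (*pos && (*pos)->key < key)
		pos = &(*pos)->next;
	n->next = *pos;
	*pos = n;
}

/* Release any unused reservation; returns how many nodes were left over. */
int state_destroy(struct state *st)
{
	int unused = st->prealloc ? 1 : 0;

	free(st->prealloc);
	st->prealloc = NULL;
	return unused;
}

/* Full sequence: preallocate, lock, store, unlock, destroy. */
int insert_key(struct tree *t, long key)
{
	struct state st = { .prealloc = NULL };

	if (state_preallocate(t, &st))
		return -1;
	t->locked = true;
	state_store_locked(t, &st, key);
	t->locked = false;
	return state_destroy(&st);
}
```

In this model, insert_key() returns 0 when the reservation was fully
consumed, while a state_preallocate() followed directly by
state_destroy() returns 1. A nonzero leftover count is the kind of signal
a redundant-allocation check could look at, under the assumptions above.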