Re: [PATCH 3/5] fs: lockless mntns rbtree lookup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 11, 2024 at 09:13:16AM +0100, Christian Brauner wrote:
> Hey Peter,
> 
> I had a question for you and meant to Cc you but forgot. This makes one
> of our rbtree uses lockless using the seqlock pattern. See below.
> 
> I saw that in 50a38035ed5c ("rbtree: provide rb_find_rcu() /
> rb_find_add_rcu()") you added new _rcu() variants.

The original patches are much older, see:

  d72da4a4d973 ("rbtree: Make lockless searches non-fatal")
  ade3f510f93a ("rbtree: Implement generic latch_tree")

(aw gawd, almost 10 years)

> We're using another search function that allows us to walk the tree in
> either direction:
> 
>         guard(read_lock)(&mnt_ns_tree_lock);
>         for (;;) {
>                 struct rb_node *node;
> 
>                 if (previous)
>                         node = rb_prev(&mntns->mnt_ns_tree_node);
>                 else
>                         node = rb_next(&mntns->mnt_ns_tree_node);
>                 if (!node)
>                         return ERR_PTR(-ENOENT);
> 
>                 mntns = node_to_mnt_ns(node);
>                 node = &mntns->mnt_ns_tree_node;
> 
> But afaict neither rb_prev() nor rb_next() are rcu safe. Have you ever
> considered adding rcu safe variants for those two as well?

Urgh, those are hard :-(

So back when I did the lockless lookups, I only ensured the child
pointers were 'stable'. I did not deal with the parent (and colour)
pointer.

The next/prev iterators very much rely on the parent pointer.

Someone would have to go through the tree rotations again and see if it
is possible to also update the parent pointers in such a way as to not
create temporary loops.

Notably, the thing you want to avoid is an interrupt doing a tree
traversal on the CPU that's doing the tree rotation getting stuck.

The other, possibly even harder option would be to (finally) implement
threaded RB trees, where the current NULL child pointers become a 'list'
pointer to the next (in-order) element. But that too makes rotations
'interesting' and must avoid creating loops.

But in all those cases you have to also consider the ramifications of
rb_next/prev hitting a modification, I suspect like with the simple
lookup, you can miss entire subtrees. So depending on the requirements,
this might not be suitable for you.

The far easier option might be to simply add a list_head along with the
rb_node and iterate that -- normally the problem with this is that you
can't easily move elements around in an RCU-list, but luck will have it
you don't move elements around. Your tree location is very static
afaict.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux