On Sun, Oct 09, 2022 at 06:45:22PM +0800, Hou Tao wrote: > Hi, > On 10/9/2022 5:05 PM, Paul E. McKenney wrote: > > On Sun, Oct 09, 2022 at 09:09:44AM +0800, Hou Tao wrote: > >> Hi Paul, > >> > >> On 10/9/2022 4:11 AM, Paul E. McKenney wrote: > >>> On Sat, Oct 08, 2022 at 09:40:04AM -0700, Alexei Starovoitov wrote: > >>>> On Sat, Oct 8, 2022 at 6:22 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > >>>>> On Fri, Oct 07, 2022 at 06:59:08PM -0700, Alexei Starovoitov wrote: > >> SNIP > >>>>>>> Understand. I was just trying to understand the exact performance overhead of > >>>>>>> call_rcu(). If the overhead of map operations are much greater than the overhead > >>>>>>> of call_rcu(), I think calling call_rcu() one millions a second will be not a > >>>>>>> problem and it also makes the implementation of qp-trie being much simpler. The > >>>>>>> OOM problem is indeed a problem, although it is also possible for the current > >>>>>>> implementation, so I will try to implement the lookup procedure which handles > >>>>>>> the reuse problem. > >>>>>> call_rcu is not just that particular function. > >>>>>> It's all the work rcu subsystem needs to do to observe gp > >>>>>> and execute that callback. Just see how many kthreads it will > >>>>>> start when overloaded like this. > >>>>> The kthreads to watch include rcu_preempt, rcu_sched, ksoftirqd*, rcuc*, > >>>>> and rcuo*. There is also the back-of-interrupt softirq context, which > >>>>> requires some care to measure accurately. > >>>>> > >>>>> The possibility of SLAB_TYPESAFE_BY_RCU has been discussed. I take it > >>>>> that the per-element locking overhead for exact iterations was a problem? > >>>>> If so, what exactly are the consistency rules for iteration? Presumably > >>>>> stronger than "if the element existed throughout, it is included in the > >>>>> iteration; if it did not exist throughout, it is not included; otherwise > >>>>> it might or might not be included" given that you get that for free. > >>>>> > >>>>> Either way, could you please tell me the exact iteration rules? > >>>> The rules are the way we make them to be. > >>>> iteration will be under lock. > >>>> lookup needs to be correct. It can retry if necessary (like htab is doing). > >>>> Randomly returning 'noexist' is, of course, not acceptable. > >>> OK, so then it is important that updates to this data structure be > >>> carried out in such a way as to avoid discombobulating lockless readers. > >>> Do the updates have that property? > >> Yes. The update procedure will copy the old pointer array to a new array first, > >> then update the new array and replace the pointer of old array by the pointer of > >> new array. > > Very good. But then why is there a problem? Is the iteration using > > multiple RCU read-side critical sections or something? > > The problem is that although the objects are RCU-freed, but these object also > can be reused immediately in bpf memory allocator. The reason for reuse is for > performance and is to reduce the possibility of OOM. Because the object can be > reused during RCU-protected lookup and the possibility of reuse is low, the > lookup procedure needs to check whether reuse is happening during lookup. And I > was arguing with Alexei about whether or no it is reasonable to provide an > optional flag to remove the immediate reuse in bpf memory allocator. Indeed, in that case there needs to be a check, for example as described in the comment preceding the definition of SLAB_TYPESAFE_BY_RCU. If the use of the element is read-only on the one hand or heuristic/statistical on the other, lighter weight approaches are possible. Would that help? Thanx, Paul > >>> The usual way to get that property is to leave the old search structure > >>> around, replacing it with the new one, and RCU-freeing the old one. > >>> In case it helps, Kung and Lehman describe how to do that for search trees: > >>> > >>> http://www.eecs.harvard.edu/~htk/publication/1980-tods-kung-lehman.pdf > >> Thanks for the paper. Just skimming through it, it seems that it uses > >> reference-counting and garbage collection to solve the safe memory reclamation > >> problem. It may be too heavy for qp-trie and we plan to use seqcount-like way to > >> check whether or not the branch and the leaf node is reused during lookup, and > >> retry the lookup if it happened. Now just checking the feasibility of the > >> solution and it seems a little complicated than expected. > > The main thing in that paper is the handling of rotations in the > > search-tree update. But if you are not using a tree, that won't be all > > that relevant. > I see. Thanks for the explanation. > > > > Thanx, Paul >