Re: [PATCH bpf-next v2 00/13] Add support for qp-trie with dynptr key

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Sun, 9 Oct 2022 02:05:35 -0700

On Sun, Oct 09, 2022 at 09:09:44AM +0800, Hou Tao wrote:
> Hi Paul,
> 
> On 10/9/2022 4:11 AM, Paul E. McKenney wrote:
> > On Sat, Oct 08, 2022 at 09:40:04AM -0700, Alexei Starovoitov wrote:
> >> On Sat, Oct 8, 2022 at 6:22 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >>> On Fri, Oct 07, 2022 at 06:59:08PM -0700, Alexei Starovoitov wrote:
> SNIP
> >>>>> Understand. I was just trying to understand the exact performance overhead of
> >>>>> call_rcu(). If the overhead of map operations are much greater than the overhead
> >>>>> of call_rcu(), I think calling call_rcu() one millions a second will be not a
> >>>>> problem and  it also makes the implementation of qp-trie being much simpler. The
> >>>>> OOM problem is indeed a problem, although it is also possible for the current
> >>>>> implementation, so I will try to implement the lookup procedure which handles
> >>>>> the reuse problem.
> >>>> call_rcu is not just that particular function.
> >>>> It's all the work rcu subsystem needs to do to observe gp
> >>>> and execute that callback. Just see how many kthreads it will
> >>>> start when overloaded like this.
> >>> The kthreads to watch include rcu_preempt, rcu_sched, ksoftirqd*, rcuc*,
> >>> and rcuo*.  There is also the back-of-interrupt softirq context, which
> >>> requires some care to measure accurately.
> >>>
> >>> The possibility of SLAB_TYPESAFE_BY_RCU has been discussed.  I take it
> >>> that the per-element locking overhead for exact iterations was a problem?
> >>> If so, what exactly are the consistency rules for iteration?  Presumably
> >>> stronger than "if the element existed throughout, it is included in the
> >>> iteration; if it did not exist throughout, it is not included; otherwise
> >>> it might or might not be included" given that you get that for free.
> >>>
> >>> Either way, could you please tell me the exact iteration rules?
> >> The rules are the way we make them to be.
> >> iteration will be under lock.
> >> lookup needs to be correct. It can retry if necessary (like htab is doing).
> >> Randomly returning 'noexist' is, of course, not acceptable.
> > OK, so then it is important that updates to this data structure be
> > carried out in such a way as to avoid discombobulating lockless readers.
> > Do the updates have that property?
> 
> Yes. The update procedure will copy the old pointer array to a new array first,
> then update the new array and replace the pointer of old array by the pointer of
> new array.

Very good.  But then why is there a problem?  Is the iteration using
multiple RCU read-side critical sections or something?

> > The usual way to get that property is to leave the old search structure
> > around, replacing it with the new one, and RCU-freeing the old one.
> > In case it helps, Kung and Lehman describe how to do that for search trees:
> >
> > http://www.eecs.harvard.edu/~htk/publication/1980-tods-kung-lehman.pdf
> Thanks for the paper. Just skimming through it, it seems that it uses
> reference-counting and garbage collection to solve the safe memory reclamation
> problem. It may be too heavy for qp-trie and we plan to use seqcount-like way to
> check whether or not the branch and the leaf node is reused during lookup, and
> retry the lookup if it happened. Now just checking the feasibility of the
> solution and it seems a little complicated than expected.

The main thing in that paper is the handling of rotations in the
search-tree update.  But if you are not using a tree, that won't be all
that relevant.

								Thanx, Paul