Re: [RFC PATCH bpf-next 0/3] Add support for qp-trie map

Toke Høiland-Jørgensen <toke@xxxxxxxxxx> · Mon, 08 Aug 2022 19:54:46 +0200

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:

> On Tue, Jul 26, 2022 at 5:42 AM Hou Tao <houtao1@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>
> Hey, sorry I'm catching up on upstream and there is just too many
> complicated patch sets coming in, so it takes time to get through
> them.
>
> I think you did a great job with this implementation, it's certainly
> worth submitting as non-RFC for proper upstream review. I know that
> some people didn't get your patches, they got into spam somehow. So I
> think it would be great to just resubmit it as non-RFC so that it
> appears in patchworks as to-be-reviewew patches and hopefully will get
> a wider audience to review this.
>
> I've tried to answer some questions below, but would definitely like
> more people to chime in. I haven't went through implementation in
> details, but superficially it looks pretty clean and certainly ready
> for proper non-RFC review.
>
> One point about user API would be to maybe instead use bpf_dynptr as
> an interface for specifying variable-sized lookup key instead of
> hard-coded
>  bpf_qp_trie_key. Please check recent work by Joanne on bpf_dynptr.
>
> In short: looks great, I think it's certainly worth adding this as BPF
> map type. Please submit as non-RFC and go through a proper review
> process. Looking forward (even if that means reviewing 1000k lines of
> dense algorithmic code :) ).
>
>> The initial motivation for qp-trie map is to reduce memory usage for
>> string keys special those with large differencies in length as
>> discussed in [0]. And as a big-endian lexicographical ordered map, it
>> can also be used for any binary data with fixed or variable length.
>>
>> Now the basic functionality of qp-trie is ready, so posting a RFC version
>> to get more feedback or suggestions about qp-trie. Specially feedback
>> about the following questions:
>>
>> (1) Application scenario for qp-trie
>> Andrii had proposed to re-implement lpm-trie by using qp-trie. The
>> advantage would be the speed up of lookup operations due to lower tree
>> depth of qp-trie. Maybe the performance of update could also be improved
>> although in cillium there is a big lock during lpm-trie update [1]. Is
>
> Well, using qp-trie approach as an internal implementation of lpm-trie
> is probably a major win already, what's wrong with that?

+1, just improving the LPM trie map type is definitely worthwhile!

>> there any other use cases for qp-trie ? Specially those cases which need
>> both ordering and memory efficiency or cases in which jhash() of htab
>> creates too much collisions and qp-trie lookup performances better than
>> hash-table lookup as shown below:
>
> I'm thinking about qp-trie as dynamically growable lookup table.
> That's a pretty big use case already. There is an RB tree proposal
> under review right now which would also satisfy dynamically growable
> criteria, but its focus is slightly different and it remains to be
> seen how convenient it will be as general-purpose resizable
> alternative to hashmap. But from benchmarks I've found online, RB tree
> will definitely use more memory than qp-trie.
>
> Ordered property seems also very useful, but I don't yet have specific
> use case for that. But once we have data structure like this in BPF,
> I'm sure use cases will pop up.

>From the various discussions of the packet queueing schemes (for both
XDP and sch_bpf), it seems some sort of ordered data structure is going
to be useful for that. Not sure this is a great fit for that use case,
but may be worth trying out. Looking harder at this is close to the top
of my list, so will play around with this series as well :)

-Toke