Re: [PATCH bpf v2 7/9] bpf: Use raw_spinlock_t for LPM trie

Toke Høiland-Jørgensen <toke@xxxxxxxxxx> · Sun, 15 Dec 2024 17:51:06 +0100

Hou Tao <houtao@xxxxxxxxxxxxxxx> writes:

> Hi,
>
> On 12/5/2024 5:47 PM, Toke Høiland-Jørgensen wrote:
>> Hou Tao <houtao@xxxxxxxxxxxxxxx> writes:
>>
>>> Hi,
>>>
>>> On 12/3/2024 9:42 AM, Alexei Starovoitov wrote:
>>>> On Fri, Nov 29, 2024 at 4:18 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>>>>> Hou Tao <houtao@xxxxxxxxxxxxxxx> writes:
>>>>>
>>>>>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>>>>>
>>>>>> After switching from kmalloc() to the bpf memory allocator, there will be
>>>>>> no blocking operation during the update of LPM trie. Therefore, change
>>>>>> trie->lock from spinlock_t to raw_spinlock_t to make LPM trie usable in
>>>>>> atomic context, even on RT kernels.
>>>>>>
>>>>>> The max value of prefixlen is 2048. Therefore, update or deletion
>>>>>> operations will find the target after at most 2048 comparisons.
>>>>>> A test case which updates an element after 2048 comparisons was
>>>>>> constructed on an 8-CPU VM; the average and maximal times for such an
>>>>>> update operation are about 210us and 900us.
>>>>> That is... quite a long time? I'm not sure we have any guidance on what
>>>>> the maximum acceptable time is (perhaps the RT folks can weigh in
>>>>> here?), but stalling for almost a millisecond seems long.
>>>>>
>>>>> Especially doing this unconditionally seems a bit risky; this means that
>>>>> even a networking program using the lpm map in the data path can stall
>>>>> the system for that long, even if it would have been perfectly happy to
>>>>> be preempted.
>>>> I don't share this concern.
>>>> 2048 comparisons is an extreme case.
>>>> I'm sure there are a million other ways to stall bpf prog for that long.
>>> 2048 is indeed an extreme case. I will run some tests to check how much
>>> time is used for the normal cases with prefixlen=32 or prefixlen=128.
>> That would be awesome, thanks!
>
> Sorry for the long delay. After applying patch set v3, the avg/max times
> are about 2.3/4 us for prefixlen = 32 and 7.7/11 us for prefixlen = 128.

Ah, excellent. With those numbers, my worries about this introducing
accidental latency spikes are much assuaged. Thanks for following up! :)

-Toke