Re: [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc

Hi,

On 1/1/2023 9:26 AM, Alexei Starovoitov wrote:
> On Fri, Dec 30, 2022 at 12:11:45PM +0800, Hou Tao wrote:
>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>
>> Hi,
>>
>> The patchset tries to fix the problems found when checking how htab map
>> handles element reuse in bpf memory allocator. The immediate reuse of
>> freed elements may lead to two problems in htab map:
>>
>> (1) reuse will reinitialize special fields (e.g., bpf_spin_lock) in the
>>     htab map value, and it may corrupt a lookup procedure running with the
>>     BPF_F_LOCK flag, which acquires the bpf_spin_lock during value copying.
>>     The corruption of the bpf_spin_lock may result in a hard lock-up.
>> (2) lookup procedure may get incorrect map value if the found element is
>>     freed and then reused.
>>
>> Because all elements of an htab map have the same type, problem #1 can be
>> fixed by supporting a ctor in the bpf memory allocator. The ctor initializes
>> the special fields in a map element only when the element is newly
>> allocated; a reused element is not reinitialized.
> Instead of adding the overhead of ctor callback let's just
> add __GFP_ZERO to flags in __alloc().
> That will address the issue 1 and will make bpf_mem_alloc behave just
> like percpu_freelist, so hashmap with BPF_F_NO_PREALLOC and default
> will behave the same way.
Using __GFP_ZERO would be simpler, but the overhead of memset() for every
allocated object may be bigger than that of a ctor callback when the allocated
object is large. It also introduces unnecessary memory zeroing when there are no
special fields in the map value.

>> Problem #2 exists for both non-preallocated and preallocated htab maps.
>> Adding a seq field to the htab element, checking it for reuse, and retrying
>> the lookup procedure could be a feasible solution, but it would make the
>> lookup API hard to use, because the user would need to check whether the
>> found element had been reused and repeat the lookup if so. A simpler
>> solution would be to just disable the reuse of freed elements and free
>> these elements after the lookup procedure ends.
> You've proposed this 'solution' twice already in qptrie thread and both
> times the answer was 'no, we cannot do this' with reasons explained.
> The 3rd time the answer is still the same.
This time a workable demo which batches call_rcu_tasks_trace() calls is
provided :) Also, I could not find a better solution for the reuse problem. But
you are right: although not reusing freed elements would make the map
implementation simpler, the potential OOM problem is hard to solve, especially
when the RCU tasks trace grace period is slow. Hope Paul can provide some
insight into the problem.
> This 'issue 2' existed in hashmap since very beginning for many years.
> It's a known quirk. There is nothing to fix really.
Should we document the unexpected behavior somewhere? I really knew nothing
about this quirk.
>
> The graph apis (aka new gen data structs) with link list and rbtree are
> in active development. Soon bpf progs will be able to implement their own
> hash maps with explicit bpf_rcu_read_lock. At that time the progs will
> be making the trade off between performance and lookup/delete race.
It seems these new gen data structs also need to solve the reuse problem,
because a global bpf memory allocator is used.
> So please respin with just __GFP_ZERO and update the patch 6
> to check for lockup only.



