Re: [PATCH bpf v1 2/3] bpf: Don't reinit map value in prealloc_lru_pop

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 8, 2022 at 5:25 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
> >
> > Thinking again. I guess the following scenario is possible:
> >
> >       rcu_read_lock()
> >          v = bpf_map_lookup_elem(&key);
> >          t1 = v->field;
> >          bpf_map_delete_elem(&key);
> >             /* another bpf program triggering bpf_map_update_elem() */
> >             /* the element with 'key' is reused */
> >             /* the value is updated */
> >          t2 = v->field;
> >          ...
> >       rcu_read_unlock()
> >
> > it is possible t1 and t2 may not be the same.
> > This should be an extremely corner case, not sure how to resolve
> > this easily without performance degradation.
>
> Yes, this is totally possible. This extreme corner case may also
> become a real problem in sleepable programs ;-).
>
> I had an idea in mind on how it can be done completely in the BPF
> program side without changing the map implementation.

+1 it's best to address it inside the program instead
of complicating and likely slowing down maps.

As far as sleepable progs.. the issue is not present
because sleepable progs are only allowed to use preallocated maps.
In non-prealloc maps free_htab_elem() would just call_rcu()
and the element would be freed into the global slab which
will be bad if bpf prog is doing something like above.

Doing call_rcu_tasks_trace() + call_rcu() would allow
non-prealloc in sleepable, but it doesn't scale
and non-prealloc is already many times slower comparing
to prealloc due to atomic_inc/dec and call_rcu.

With new bpf_mem_alloc I'm experimenting with batching
of call_rcu_tasks_trace + call_rcu, so hash map can be
dynamically allocated, just as fast as full prealloc and
finally safe in both sleepable and traditional progs.
I was thinking of adding a new SLAB_TYPESAFE_BY_RCU_TASKS_TRACE
flag to the slab, but decided to go that route only if
batching of call_rcu*() won't be sufficient.
Sorry it takes so long to get the patches ready.
Summer in WA is the best time to take vacation :)
I've been taking plenty.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux