Re: [PATCH bpf 05/11] bpf: Add bpf_map_of_map_fd_{get,put}_ptr() helpers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 13, 2023 at 08:53:06AM +0800, Hou Tao wrote:
> Hi,
> 
> On 11/10/2023 12:58 PM, Paul E. McKenney wrote:
> > On Fri, Nov 10, 2023 at 11:34:03AM +0800, Hou Tao wrote:
> >> Hi Martin,
> >>
> >> On 11/10/2023 10:48 AM, Martin KaFai Lau wrote:
> >>> On 11/9/23 5:45 PM, Paul E. McKenney wrote:
> >>>>>>>>>> +void bpf_map_of_map_fd_put_ptr(void *ptr, bool need_defer)
> >>>>>>>>>> +{
> >>>>>>>>>> +    struct bpf_inner_map_element *element = ptr;
> >>>>>>>>>> +
> >>>>>>>>>> +    /* Do bpf_map_put() after a RCU grace period and a tasks
> >>>>>>>>>> trace
> >>>>>>>>>> +     * RCU grace period, so it is certain that the bpf program
> >>>>>>>>>> which is
> >>>>>>>>>> +     * manipulating the map now has exited when bpf_map_put() is
> >>>>>>>>>> called.
> >>>>>>>>>> +     */
> >>>>>>>>>> +    if (need_defer)
> >>>>>>>>> "need_defer" should only happen from the syscall cmd? Instead of
> >>>>>>>>> adding rcu_head to each element, how about
> >>>>>>>>> "synchronize_rcu_mult(call_rcu, call_rcu_tasks)" here?
> >>>>>>>> No. I have tried the method before, but it didn't work due to
> >>>>>>>> dead-lock
> >>>>>>>> (will mention that in commit message in v2). The reason is that bpf
> >>>>>>>> syscall program may also do map update through sys_bpf helper.
> >>>>>>>> Because
> >>>>>>>> bpf syscall program is running with sleep-able context and has
> >>>>>>>> rcu_read_lock_trace being held, so call
> >>>>>>>> synchronize_rcu_mult(call_rcu,
> >>>>>>>> call_rcu_tasks) will lead to dead-lock.
> >>> Need to think of a less intrusive solution instead of adding rcu_head
> >>> to each element and lookup also needs an extra de-referencing.
> >> I see.
> >>> May be the bpf_map_{update,delete}_elem(&outer_map, ....) should not
> >>> be done by the syscall program? Which selftest does it?
> >> Now bpf_map_update_elem is allowed for bpf_sys_bpf helper. If I
> >> remembered correctly it was map_ptr.
> >>> Can the inner_map learn that it has been deleted from an outer map
> >>> that is used in a sleepable prog->aux->used_maps? The
> >>> bpf_map_free_deferred() will then wait for a task_trace gp?
> >> Considering an inner_map may be used by multiple outer_map, the
> >> following solution will be simpler: if the inner map has been deleted
> >> from an outer map once, its free must be delayed after one RCU GP and
> >> one tasks trace RCU GP. But I will check whether it is possible to only
> >> wait for one RCU GP instead of two.
> > If you are freeing a large quantity of elements at a time, one approach
> > is to use a single rcu_head structure for the group.  (Or, in this case,
> > maybe a pair of rcu_head structures, one for call_rcu() and the other
> > for call_rcu_tasks_trace().)
> >
> > This requires that you be able to link the elements in the group
> > together somehow, which requires some per-element storage, but only
> > one word per element instead of two.
> >
> > There are other variations on this theme, depending on what constraints
> > apply here.
> 
> Thanks for your suggestions. Although there are batch update support for
> inner map, but I think inner map is updated one-by-one at most case. And
> the main concern here is the extra dereference due to memory allocation,
> so I think adding extra flags to indicate bpf_mem_free_deferred() to
> free the map differently may be appropriate.

Whatever gets the job done is of course goodness, and yes, if the freeing
cannot be batched, my suggestion won't help much.  ;-)

							Thanx, Paul




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux