Re: handling EINTR from bpf_map_lookup_batch

On Tue, Feb 4, 2025 at 8:19 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2/5/2025 2:08 AM, Yan Zhai wrote:
> > I am getting EINTR when trying to use bpf_map_lookup_batch on an
> > array_of_maps. The error happens when there is a "hole" in the array.
> > For example, say the outer map has max entries of 256, each inner map
> > is used for a transport protocol, and I only populated key 6 and
> > 17 for TCP and UDP. Then when I do batch lookup, I always get EINTR.
> > This so far seems to only happen with array of maps. Does it make
> > sense to allow skipping to the next key for this map type? Something
> > like:
> >
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index c420edbfb7c8..83915a8059ef 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -2027,6 +2027,8 @@ int generic_map_lookup_batch(struct bpf_map *map,
> >                                          attr->batch.elem_flags);
> >
> >                 if (err == -ENOENT) {
> > +                       if (IS_FD_ARRAY(map))
> > +                               goto next_key;
>
> It seems only BPF_MAP_TYPE_ARRAY_OF_MAPS supports batched operation, so
> map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS will be enough. It is also
> better to reset err as 0, otherwise generic_map_lookup_batch may return
> -ENOENT.

Jumping to the next key always restarts the loop, so err will be set
correctly afterwards.

> >                         if (retry) {
> >                                 retry--;
> >                                 continue;
> > @@ -2048,6 +2050,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
> >                         goto free_buf;
> >                 }
> >
> > +next_key:
> >                 if (!prev_key)
> >                         prev_key = buf_prevkey;
> >
>
> Makes sense.  Please add a selftest for it. Another way is to return id 0
> for these non-existent values in the fd array, but it may break existing
> progs. Just skipping the empty array slot is better.

Working on it.

thanks
Yan

> > Also the context about my scenario if anyone is curious: I am trying
> > to associate each map to a userspace service in a multi tenant
> > environment. This is an addition to cgroup accounting, in case the
> > creator cgroup goes away, e.g. systemd service restarts always
> > recreate cgroups. And we also want to monitor the utilization level of
> > non-prealloc maps of different tenants. When dealing with inner maps,
> > it is not always trivial. To connect dots I choose to read these IDs
> > periodically and link them to the tenant of the outer map, that's
> > where this EINTR occurred.
> >
> > best
> > Yan
> >
> > .
>