Re: [PATCH bpf-next v3 2/3] bpf: Support kptrs in local storage maps

On Mon, Feb 27, 2023 at 10:16 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> On 2/25/23 7:40 AM, Kumar Kartikeya Dwivedi wrote:
> > Enable support for kptrs in local storage maps by wiring up the freeing
> > of these kptrs from map value. Freeing of bpf_local_storage_map is only

from a map value.

> > delayed in case there are special fields, therefore bpf_selem_free_*

This needs a bit of explanation here: can you add a bit more
description to this commit's log about what these special fields are?
It would be great if the commit description were hermetic and
self-explanatory.
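
For my own understanding (this is not spelled out in the patch, so
please correct me if I'm off): "special fields" here means the map
value type carries a BTF record, e.g. a kptr declared in the value,
which is what requires bpf_obj_free_fields() when an element is freed.
Roughly something like this from the program side (illustrative only,
names made up):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  /* As in the selftests, in case bpf_helpers.h does not provide it. */
  #ifndef __kptr
  #define __kptr __attribute__((btf_type_tag("kptr")))
  #endif

  /* A kptr in the value is what makes smap->map.record non-NULL and
   * turns on the delayed freeing path added by this patch.
   */
  struct map_value {
          struct task_struct __kptr *task;
  };

  struct {
          __uint(type, BPF_MAP_TYPE_TASK_STORAGE);
          __uint(map_flags, BPF_F_NO_PREALLOC);
          __type(key, int);
          __type(value, struct map_value);
  } task_storage SEC(".maps");

Spelling something like that out in the commit log would make it much
easier to follow.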

> > path can also only dereference smap safely in that case. This is
> > recorded using a bool utilizing a hole in bpF_local_storage_elem. It

nit: bpf_local_storage_elem

> > could have been tagged in the pointer value smap using the lowest bit
> > (since alignment > 1), but since there was already a hole I went with
> > the simpler option. Only the map structure freeing is delayed using RCU
> > barriers, as the buckets aren't used when selem is being freed, so they
> > can be freed once all readers of the bucket lists can no longer access
> > it.
> >
> > Cc: Martin KaFai Lau <martin.lau@xxxxxxxxxx>
> > Cc: KP Singh <kpsingh@xxxxxxxxxx>
> > Cc: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
> > ---
> >   include/linux/bpf_local_storage.h |  6 ++++
> >   kernel/bpf/bpf_local_storage.c    | 48 ++++++++++++++++++++++++++++---
> >   kernel/bpf/syscall.c              |  6 +++-
> >   kernel/bpf/verifier.c             | 12 +++++---
> >   4 files changed, 63 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
> > index 6d37a40cd90e..0fe92986412b 100644
> > --- a/include/linux/bpf_local_storage.h
> > +++ b/include/linux/bpf_local_storage.h
> > @@ -74,6 +74,12 @@ struct bpf_local_storage_elem {
> >       struct hlist_node snode;        /* Linked to bpf_local_storage */
> >       struct bpf_local_storage __rcu *local_storage;
> >       struct rcu_head rcu;
> > +     bool can_use_smap; /* Is it safe to access smap in bpf_selem_free_* RCU
> > +                         * callbacks? bpf_local_storage_map_free only
> > +                         * executes rcu_barrier when there are special
> > +                         * fields, this field remembers that to ensure we
> > +                         * don't access already freed smap in sdata.
> > +                         */
> >       /* 8 bytes hole */
> >       /* The data is stored in another cacheline to minimize
> >        * the number of cachelines access during a cache hit.
> > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
> > index 58da17ae5124..2bdd722fe293 100644
> > --- a/kernel/bpf/bpf_local_storage.c
> > +++ b/kernel/bpf/bpf_local_storage.c
> > @@ -85,6 +85,7 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner,
> >       if (selem) {
> >               if (value)
> >                       copy_map_value(&smap->map, SDATA(selem)->data, value);
> > +             /* No need to call check_and_init_map_value as memory is zero init */
> >               return selem;
> >       }
> >
> > @@ -113,10 +114,25 @@ static void bpf_selem_free_rcu(struct rcu_head *rcu)
> >       struct bpf_local_storage_elem *selem;
> >
> >       selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
> > +     /* The can_use_smap bool is set whenever we need to free additional
> > +      * fields in selem data before freeing selem. bpf_local_storage_map_free
> > +      * only executes rcu_barrier to wait for RCU callbacks when it has
> > +      * special fields, hence we can only conditionally dereference smap, as
> > +      * by this time the map might have already been freed without waiting
> > +      * for our call_rcu callback if it did not have any special fields.
> > +      */
> > +     if (selem->can_use_smap)
> > +             bpf_obj_free_fields(SDATA(selem)->smap->map.record, SDATA(selem)->data);
> > +     kfree(selem);
> > +}
> > +
> > +static void bpf_selem_free_tasks_trace_rcu(struct rcu_head *rcu)
> nit. Maybe a shorter name, bpf_selem_free_rcu_trace()?
>
> > +{
> > +     /* Free directly if Tasks Trace RCU GP also implies RCU GP */
> >       if (rcu_trace_implies_rcu_gp())
> > -             kfree(selem);
> > +             bpf_selem_free_rcu(rcu);
> >       else
> > -             kfree_rcu(selem, rcu);
> > +             call_rcu(rcu, bpf_selem_free_rcu);
> >   }
> >
> >   /* local_storage->lock must be held and selem->local_storage == local_storage.
> > @@ -170,9 +186,9 @@ static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_stor
> >               RCU_INIT_POINTER(local_storage->cache[smap->cache_idx], NULL);
> >
> >       if (use_trace_rcu)
> > -             call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_rcu);
> > +             call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_tasks_trace_rcu);
> >       else
> > -             kfree_rcu(selem, rcu);
> > +             call_rcu(&selem->rcu, bpf_selem_free_rcu);
>
> Instead of adding 'bool can_use_smap' to 'struct bpf_local_storage_elem', can it
> be a different RCU callback when smap->map.record is not NULL, so that only this
> new callback needs to use smap?
> I have a use for this 8-byte hole when using bpf_mem_alloc in bpf_local_storage.
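
If I read the suggestion right, it would look something like this
untested sketch (names made up), with the flag and the hole both going
away:

  static void bpf_selem_free_rcu(struct rcu_head *rcu)
  {
          /* Plain path: no special fields, smap may already be freed. */
          kfree(container_of(rcu, struct bpf_local_storage_elem, rcu));
  }

  static void bpf_selem_free_fields_rcu(struct rcu_head *rcu)
  {
          struct bpf_local_storage_elem *selem;

          selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
          /* Only used when map.record is non-NULL, in which case
           * bpf_local_storage_map_free waits in rcu_barrier(), so smap
           * is still valid here.
           */
          bpf_obj_free_fields(SDATA(selem)->smap->map.record,
                              SDATA(selem)->data);
          kfree(selem);
  }

and the unlink path would pick the callback instead of checking a
per-selem flag:

          if (!IS_ERR_OR_NULL(SDATA(selem)->smap->map.record))
                  call_rcu(&selem->rcu, bpf_selem_free_fields_rcu);
          else
                  call_rcu(&selem->rcu, bpf_selem_free_rcu);

(with the same split for the tasks trace variant). That would keep the
8-byte hole free for the bpf_mem_alloc work.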
>
> >
> >       return free_local_storage;
> >   }
> > @@ -240,6 +256,11 @@ void bpf_selem_link_map(struct bpf_local_storage_map *smap,
> >       RCU_INIT_POINTER(SDATA(selem)->smap, smap);
> >       hlist_add_head_rcu(&selem->map_node, &b->list);
> >       raw_spin_unlock_irqrestore(&b->lock, flags);
> > +
> > +     /* If our data will have special fields, smap will wait for us to use
> > +      * its record in bpf_selem_free_* RCU callbacks before freeing itself.
> > +      */
> > +     selem->can_use_smap = !IS_ERR_OR_NULL(smap->map.record);
> >   }
> >
> >   void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool use_trace_rcu)
> > @@ -723,6 +744,25 @@ void bpf_local_storage_map_free(struct bpf_map *map,
> >        */
> >       synchronize_rcu();
> >
> > +     /* Only delay freeing of smap, buckets are not needed anymore */
> >       kvfree(smap->buckets);
> > +
> > +     /* When local storage has special fields, callbacks for
> > +      * bpf_selem_free_rcu and bpf_selem_free_tasks_trace_rcu will keep using
> > +      * the map BTF record, we need to execute an RCU barrier to wait for
> > +      * them as the record will be freed right after our map_free callback.
> > +      */
> > +     if (!IS_ERR_OR_NULL(smap->map.record)) {
> > +             rcu_barrier_tasks_trace();
> > +             /* We cannot skip rcu_barrier() when rcu_trace_implies_rcu_gp()
> > +              * is true, because while call_rcu invocation is skipped in that
> > +              * case in bpf_selem_free_tasks_trace_rcu (and all local storage
> > +              * maps pass use_trace_rcu = true), there can be call_rcu
> > +              * callbacks based on use_trace_rcu = false in the earlier while
> > +              * ((selem = ...)) loop or from bpf_local_storage_unlink_nolock
> > +              * called from owner's free path.
> > +              */
> > +             rcu_barrier();
>
> Others lgtm.

+1

>



