Re: [PATCH v3 bpf-next 2/3] bpf: Introduce task_vma open-coded iterator kfuncs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 22, 2023 at 3:37 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Aug 22, 2023 at 1:14 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
> >
> >
> >
> > On 8/22/23 12:19 PM, David Marchevsky wrote:
> > > On 8/22/23 1:42 PM, Yonghong Song wrote:
> > >>
> > >>
> > >> On 8/21/23 10:05 PM, Dave Marchevsky wrote:
> > >>> This patch adds kfuncs bpf_iter_task_vma_{new,next,destroy} which allow
> > >>> creation and manipulation of struct bpf_iter_task_vma in open-coded
> > >>> iterator style. BPF programs can use these kfuncs directly or through
> > >>> bpf_for_each macro for natural-looking iteration of all task vmas.
> > >>>
> > >>> The implementation borrows heavily from bpf_find_vma helper's locking -
> > >>> differing only in that it holds the mmap_read lock for all iterations
> > >>> while the helper only executes its provided callback on a maximum of 1
> > >>> vma. Aside from locking, struct vma_iterator and vma_next do all the
> > >>> heavy lifting.
> > >>>
> > >>> The newly-added struct bpf_iter_task_vma has a name collision with a
> > >>> selftest for the seq_file task_vma iter's bpf skel, so the selftests/bpf/progs
> > >>> file is renamed in order to avoid the collision.
> > >>>
> > >>> A pointer to an inner data struct, struct bpf_iter_task_vma_kern_data, is the
> > >>> only field in struct bpf_iter_task_vma. This is because the inner data
> > >>> struct contains a struct vma_iterator (not ptr), whose size is likely to
> > >>> change under us. If bpf_iter_task_vma_kern contained vma_iterator directly
> > >>> such a change would require change in opaque bpf_iter_task_vma struct's
> > >>> size. So better to allocate vma_iterator using BPF allocator, and since
> > >>> that alloc must already succeed, might as well allocate all iter fields,
> > >>> thereby freezing struct bpf_iter_task_vma size.
> > >>>
> > >>> Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx>
> > >>> Cc: Nathan Slingerland <slinger@xxxxxxxx>
> > >>> ---
> > >>>    include/uapi/linux/bpf.h                      |  4 +
> > >>>    kernel/bpf/helpers.c                          |  3 +
> > >>>    kernel/bpf/task_iter.c                        | 84 +++++++++++++++++++
> > >>>    tools/include/uapi/linux/bpf.h                |  4 +
> > >>>    tools/lib/bpf/bpf_helpers.h                   |  8 ++
> > >>>    .../selftests/bpf/prog_tests/bpf_iter.c       | 26 +++---
> > >>>    ...f_iter_task_vma.c => bpf_iter_task_vmas.c} |  0
> > >>>    7 files changed, 116 insertions(+), 13 deletions(-)
> > >>>    rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%)
> > >>>
> > >>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > >>> index 8790b3962e4b..49fc1989a548 100644
> > >>> --- a/include/uapi/linux/bpf.h
> > >>> +++ b/include/uapi/linux/bpf.h
> > >>> @@ -7311,4 +7311,8 @@ struct bpf_iter_num {
> > >>>        __u64 __opaque[1];
> > >>>    } __attribute__((aligned(8)));
> > >>>    +struct bpf_iter_task_vma {
> > >>> +    __u64 __opaque[1]; /* See bpf_iter_num comment above */
> > >>> +} __attribute__((aligned(8)));
> > >>
> > >> In the future, we might have bpf_iter_cgroup, bpf_iter_task, bpf_iter_cgroup_task, etc. They may all use the same struct
> > >> like
> > >>    struct bpf_iter_<...> {
> > >>      __u64 __opaque[1];
> > >>    } __attribute__((aligned(8)));
> > >>
> > >> Maybe we want a generic one instead of having lots of
> > >> structs with the same underline definition? For example,
> > >>    struct bpf_iter_generic
> > >> ?
> > >>
> > >
> > > The bpf_for_each macro assumes a consistent naming scheme for opaque iter struct
> > > and associated kfuncs. Having a 'bpf_iter_generic' shared amongst multiple types
> > > of iters would break the scheme. We could:
> > >
> > >    * Add bpf_for_each_generic that only uses bpf_iter_generic
> > >      * This exposes implementation details in an ugly way, though.
> > >    * Do some macro magic to pick bpf_iter_generic for some types of iters, and
> > >      use consistent naming pattern for others.
> > >      * I'm not sure how to do this with preprocessor
> > >    * Migrate all opaque iter structs to only contain pointer to bpf_mem_alloc'd
> > >      data struct, and use bpf_iter_generic for all of them
> > >      * Probably need to see more iter implementation / usage before making such
> > >        a change
> > >    * Do 'typedef __u64 __aligned(8) bpf_iter_<...>
> > >      * BTF_KIND_TYPEDEF intead of BTF_KIND_STRUCT might throw off some verifier
> > >        logic. Could do similar typedef w/ struct to try to work around
> > >        it.
> > >
> > > Let me know what you think. Personally I considered doing typedef while
> > > implementing this, so that's the alternative I'd choose.
> >
> > Okay, since we have naming convention restriction, typedef probably the
> > best option, something like
> >    typedef struct bpf_iter_num bpf_iter_task_vma
> > ?
> >
> > Verifier might need to be changed if verifier strips all modifiers
> > (including tyypedef) to find the struct name.
>
> I don't quite see how typedef helps here.
> Say we do:
> struct bpf_iter_task_vma {
>      __u64 __opaque[1];
> } __attribute__((aligned(8)));
>
> as Dave is proposing.
> Then tomorrow we add another bpf_iter_foo that is exactly the same opaque[1].
> And we will have bpf_iter_num, bpf_iter_task_vma, bpf_iter_foo structs
> with the same layout. So what? Eye sore?
> In case we need to extend task_vma from 1 to 2 it will be easier to do
> when all of them are separate structs.
>
> And typedef has unknown verification implications.

+1, I wouldn't complicate things and have a proper struct with strict
naming convention for each iter kind

>
> Either way we need to find a way to move these structs from uapi/bpf.h
> along with bpf_rb_root and friends to some "obviously unstable" header.

I don't think we have to add struct bpf_iter_task_vma to uapi/bpf.h at
all and rely on it coming from vmlinux.h. As I mentioned in previous
email, num iterator is a bit special in its generality and simplicity
(which allows to be confident it won't need to be changed), while
other iterators should be treated as more unstable and come from
vmlinux.h (or wherever else kfuncs will be coming from in the future).

tl;dr: we can define this struct right next to where we defined
bpf_iter_task_vma_new() kfunc. Unless I'm missing some complication,
of course.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux