Re: [PATCH v3 bpf-next 2/3] bpf: Introduce task_vma open-coded iterator kfuncs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 8/22/23 3:36 PM, Alexei Starovoitov wrote:
On Tue, Aug 22, 2023 at 1:14 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:



On 8/22/23 12:19 PM, David Marchevsky wrote:
On 8/22/23 1:42 PM, Yonghong Song wrote:


On 8/21/23 10:05 PM, Dave Marchevsky wrote:
This patch adds kfuncs bpf_iter_task_vma_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_task_vma in open-coded
iterator style. BPF programs can use these kfuncs directly or through
bpf_for_each macro for natural-looking iteration of all task vmas.

The implementation borrows heavily from bpf_find_vma helper's locking -
differing only in that it holds the mmap_read lock for all iterations
while the helper only executes its provided callback on a maximum of 1
vma. Aside from locking, struct vma_iterator and vma_next do all the
heavy lifting.

The newly-added struct bpf_iter_task_vma has a name collision with a
selftest for the seq_file task_vma iter's bpf skel, so the selftests/bpf/progs
file is renamed in order to avoid the collision.

A pointer to an inner data struct, struct bpf_iter_task_vma_kern_data, is the
only field in struct bpf_iter_task_vma. This is because the inner data
struct contains a struct vma_iterator (not ptr), whose size is likely to
change under us. If bpf_iter_task_vma_kern contained vma_iterator directly
such a change would require change in opaque bpf_iter_task_vma struct's
size. So better to allocate vma_iterator using BPF allocator, and since
that alloc must already succeed, might as well allocate all iter fields,
thereby freezing struct bpf_iter_task_vma size.

Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx>
Cc: Nathan Slingerland <slinger@xxxxxxxx>
---
    include/uapi/linux/bpf.h                      |  4 +
    kernel/bpf/helpers.c                          |  3 +
    kernel/bpf/task_iter.c                        | 84 +++++++++++++++++++
    tools/include/uapi/linux/bpf.h                |  4 +
    tools/lib/bpf/bpf_helpers.h                   |  8 ++
    .../selftests/bpf/prog_tests/bpf_iter.c       | 26 +++---
    ...f_iter_task_vma.c => bpf_iter_task_vmas.c} |  0
    7 files changed, 116 insertions(+), 13 deletions(-)
    rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 8790b3962e4b..49fc1989a548 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7311,4 +7311,8 @@ struct bpf_iter_num {
        __u64 __opaque[1];
    } __attribute__((aligned(8)));
    +struct bpf_iter_task_vma {
+    __u64 __opaque[1]; /* See bpf_iter_num comment above */
+} __attribute__((aligned(8)));

In the future, we might have bpf_iter_cgroup, bpf_iter_task, bpf_iter_cgroup_task, etc. They may all use the same struct
like
    struct bpf_iter_<...> {
      __u64 __opaque[1];
    } __attribute__((aligned(8)));

Maybe we want a generic one instead of having lots of
structs with the same underline definition? For example,
    struct bpf_iter_generic
?


The bpf_for_each macro assumes a consistent naming scheme for opaque iter struct
and associated kfuncs. Having a 'bpf_iter_generic' shared amongst multiple types
of iters would break the scheme. We could:

    * Add bpf_for_each_generic that only uses bpf_iter_generic
      * This exposes implementation details in an ugly way, though.
    * Do some macro magic to pick bpf_iter_generic for some types of iters, and
      use consistent naming pattern for others.
      * I'm not sure how to do this with preprocessor
    * Migrate all opaque iter structs to only contain pointer to bpf_mem_alloc'd
      data struct, and use bpf_iter_generic for all of them
      * Probably need to see more iter implementation / usage before making such
        a change
    * Do 'typedef __u64 __aligned(8) bpf_iter_<...>
      * BTF_KIND_TYPEDEF intead of BTF_KIND_STRUCT might throw off some verifier
        logic. Could do similar typedef w/ struct to try to work around
        it.

Let me know what you think. Personally I considered doing typedef while
implementing this, so that's the alternative I'd choose.

Okay, since we have naming convention restriction, typedef probably the
best option, something like
    typedef struct bpf_iter_num bpf_iter_task_vma
?

Verifier might need to be changed if verifier strips all modifiers
(including tyypedef) to find the struct name.

I don't quite see how typedef helps here.
Say we do:
struct bpf_iter_task_vma {
      __u64 __opaque[1];
} __attribute__((aligned(8)));

as Dave is proposing.
Then tomorrow we add another bpf_iter_foo that is exactly the same opaque[1].
And we will have bpf_iter_num, bpf_iter_task_vma, bpf_iter_foo structs
with the same layout. So what? Eye sore?
In case we need to extend task_vma from 1 to 2 it will be easier to do
when all of them are separate structs.

And typedef has unknown verification implications.

This is true. Some investigation is needed.


Either way we need to find a way to move these structs from uapi/bpf.h
along with bpf_rb_root and friends to some "obviously unstable" header.

If we move this out uapi/bpf.h to an unstable header, we will have much
more flexibility for future change. Original idea (v1) to allocate the
whole struct on the stack would be okay even if kernel changes
struct vma_iterator size, whether bigger or smaller.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux