On 8/10/23 5:57 PM, Stanislav Fomichev wrote: > On 08/10, Dave Marchevsky wrote: >> This patch adds kfuncs bpf_iter_task_vma_{new,next,destroy} which allow >> creation and manipulation of struct bpf_iter_task_vma in open-coded >> iterator style. BPF programs can use these kfuncs directly or through >> bpf_for_each macro for natural-looking iteration of all task vmas. >> >> The implementation borrows heavily from bpf_find_vma helper's locking - >> differing only in that it holds the mmap_read lock for all iterations >> while the helper only executes its provided callback on a maximum of 1 >> vma. Aside from locking, struct vma_iterator and vma_next do all the >> heavy lifting. >> >> The newly-added struct bpf_iter_task_vma has a name collision with a >> selftest for the seq_file task_vma iter's bpf skel, so the selftests/bpf/progs >> file is renamed in order to avoid the collision. >> >> Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx> >> Cc: Nathan Slingerland <slinger@xxxxxxxx> >> --- >> include/uapi/linux/bpf.h | 5 ++ >> kernel/bpf/helpers.c | 3 + >> kernel/bpf/task_iter.c | 56 +++++++++++++++++++ >> tools/include/uapi/linux/bpf.h | 5 ++ >> tools/lib/bpf/bpf_helpers.h | 8 +++ >> .../selftests/bpf/prog_tests/bpf_iter.c | 26 ++++----- >> ...f_iter_task_vma.c => bpf_iter_task_vmas.c} | 0 >> 7 files changed, 90 insertions(+), 13 deletions(-) >> rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%) >> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >> index d21deb46f49f..c4a65968f9f5 100644 >> --- a/include/uapi/linux/bpf.h >> +++ b/include/uapi/linux/bpf.h >> @@ -7291,4 +7291,9 @@ struct bpf_iter_num { >> __u64 __opaque[1]; >> } __attribute__((aligned(8))); >> >> +struct bpf_iter_task_vma { > > [..] > >> + __u64 __opaque[9]; /* See bpf_iter_num comment above */ >> + char __opaque_c[3]; > > Everything in the series makes sense, but this part is a big confusing > when reading without too much context. If you're gonna do a respin, maybe: > > - __opaque_c[8*9+3] (or whatever the size is)? any reason for separate > __u64 + char? IIUC this is because BTF generation doesn't pick up __attribute__((aligned(8))), so if a vmlinux.h is generated via 'bpftool btf dump file vmlinux format c' and this struct only contains chars, it won't have the correct alignment. I'm not sure if the bitfield approach taken by bpf_{list,rb}_node similar has the same effect. Some quick googling indicates that if it does, it's probably not in the C standard. But yeah, I agree that it's ugly. While we're on the topic, WDYT about my comment in the cover letter about this struct (copied here for convenience): * The struct vma_iterator wrapped by struct bpf_iter_task_vma itself wraps struct ma_state. Because we need the entire struct, not a ptr, changes to either struct vma_iterator or struct ma_state will necessitate changing the opaque struct bpf_iter_task_vma to account for the new size. This feels a bit brittle. We could instead use bpf_mem_alloc to allocate a struct vma_iterator in bpf_iter_task_vma_new and have struct bpf_iter_task_vma point to that, but that's not quite equivalent as BPF progs will usually use the stack for this struct via bpf_for_each. Went with the simpler route for now. > - maybe worth adding something like /* Opaque representation of > bpf_iter_task_vma_kern; see bpf_iter_num comment above */. > that bpf_iter_task_vma<>bpf_iter_task_vma_kern wasn't super apparent > until I got to the BUG_ON part It feels weird to refer to the non-UAPI _kern struct in uapi header. Maybe better to add a comment to the _kern struct referring to this one? I don't feel strongly either way, though.