Re: [PATCH bpf-next v5 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 10, 2024 at 02:01:53PM +0000, Juntong Deng wrote:
> This patch series adds open-coded style process file iterator
> bpf_iter_task_file and bpf_fget_task() kfunc, and corresponding
> selftests test cases.
> 
> In addition, since fs kfuncs is generic and useful for scenarios
> other than LSM, this patch makes fs kfuncs available for SYSCALL
> and TRACING program types [0].
> 
> [0]: https://lore.kernel.org/bpf/CAPhsuW6ud21v2xz8iSXf=CiDL+R_zpQ+p8isSTMTw=EiJQtRSw@xxxxxxxxxxxxxx/
> 
> Although iter/task_file already exists, for CRIB we still need the

What is CRIB?

> open-coded iterator style process file iterator, and the same is true
> for other bpf iterators such as iter/tcp, iter/udp, etc.
> 
> The traditional bpf iterator is more like a bpf version of procfs, but
> similar to procfs, it is not suitable for CRIB scenarios that need to
> obtain large amounts of complex, multi-level in-kernel information.
> 
> The following is from previous discussions [2].
> 
> [2]: https://lore.kernel.org/bpf/AM6PR03MB5848CA34B5B68C90F210285E99B12@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> 
> This is because the context of bpf iterators is fixed and bpf iterators
> cannot be nested. This means that a bpf iterator program can only
> complete a specific small iterative dump task, and cannot dump
> multi-level data.
> 
> An example, when we need to dump all the sockets of a process, we need
> to iterate over all the files (sockets) of the process, and iterate over
> the all packets in the queue of each socket, and iterate over all data
> in each packet.
> 
> If we use bpf iterator, since the iterator can not be nested, we need to
> use socket iterator program to get all the basic information of all
> sockets (pass pid as filter), and then use packet iterator program to
> get the basic information of all packets of a specific socket (pass pid,
> fd as filter), and then use packet data iterator program to get all the
> data of a specific packet (pass pid, fd, packet index as filter).
> 
> This would be complicated and require a lot of (each iteration)
> bpf program startup and exit (leading to poor performance).
> 
> By comparison, open coded iterator is much more flexible, we can iterate
> in any context, at any time, and iteration can be nested, so we can
> achieve more flexible and more elegant dumping through open coded
> iterators.
> 
> With open coded iterators, all of the above can be done in a single
> bpf program, and with nested iterators, everything becomes compact
> and simple.
> 
> Also, bpf iterators transmit data to user space through seq_file,
> which involves a lot of open (bpf_iter_create), read, close syscalls,
> context switching, memory copying, and cannot achieve the performance
> of using ringbuf.
> 
> Signed-off-by: Juntong Deng <juntong.deng@xxxxxxxxxxx>
> ---
> v4 -> v5:
> * Add file type checks in test cases for process file iterator
>   and bpf_fget_task().
> 
> * Use fentry to synchronize tests instead of waiting in a loop.
> 
> * Remove path_d_path_kfunc_non_lsm test case.
> 
> * Replace task_lookup_next_fdget_rcu() with fget_task_next().
> 
> * Remove future merge conflict section in cover letter (resolved).
> 
> v3 -> v4:
> * Make all kfuncs generic, not CRIB specific.
> 
> * Move bpf_fget_task to fs/bpf_fs_kfuncs.c.
> 
> * Remove bpf_iter_task_file_get_fd and bpf_get_file_ops_type.
> 
> * Use struct bpf_iter_task_file_item * as the return value of
>   bpf_iter_task_file_next.
> 
> * Change fd to unsigned int type and add next_fd.
> 
> * Add KF_RCU_PROTECTED to bpf_iter_task_file_new.
> 
> * Make fs kfuncs available to SYSCALL and TRACING program types.
> 
> * Update all relevant test cases.
> 
> * Remove the discussion section from cover letter.
> 
> v2 -> v3:
> * Move task_file open-coded iterator to kernel/bpf/helpers.c.
> 
> * Fix duplicate error code 7 in test_bpf_iter_task_file().
> 
> * Add comment for case when bpf_iter_task_file_get_fd() returns -1.
> 
> * Add future plans in commit message of "Add struct file related
>   CRIB kfuncs".
> 
> * Add Discussion section to cover letter.
> 
> v1 -> v2:
> * Fix a type definition error in the fd parameter of
>   bpf_fget_task() at crib_common.h.
> 
> Juntong Deng (5):
>   bpf: Introduce task_file open-coded iterator kfuncs
>   selftests/bpf: Add tests for open-coded style process file iterator
>   bpf: Add bpf_fget_task() kfunc
>   bpf: Make fs kfuncs available for SYSCALL and TRACING program types
>   selftests/bpf: Add tests for bpf_fget_task() kfunc
> 
>  fs/bpf_fs_kfuncs.c                            |  42 ++++---
>  kernel/bpf/helpers.c                          |   3 +
>  kernel/bpf/task_iter.c                        |  92 ++++++++++++++
>  .../testing/selftests/bpf/bpf_experimental.h  |  15 +++
>  .../selftests/bpf/prog_tests/fs_kfuncs.c      |  46 +++++++
>  .../testing/selftests/bpf/prog_tests/iters.c  |  79 ++++++++++++
>  .../selftests/bpf/progs/fs_kfuncs_failure.c   |  33 +++++
>  .../selftests/bpf/progs/iters_task_file.c     |  88 ++++++++++++++
>  .../bpf/progs/iters_task_file_failure.c       | 114 ++++++++++++++++++
>  .../selftests/bpf/progs/test_fget_task.c      |  63 ++++++++++
>  .../selftests/bpf/progs/verifier_vfs_reject.c |  10 --
>  11 files changed, 559 insertions(+), 26 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
>  create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file.c
>  create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file_failure.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_fget_task.c
> 
> -- 
> 2.39.5
> 




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux