Re: [PATCH bpf-next v5 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024/12/10 14:44, Christian Brauner wrote:
On Tue, Dec 10, 2024 at 02:01:53PM +0000, Juntong Deng wrote:
This patch series adds open-coded style process file iterator
bpf_iter_task_file and bpf_fget_task() kfunc, and corresponding
selftests test cases.

In addition, since fs kfuncs is generic and useful for scenarios
other than LSM, this patch makes fs kfuncs available for SYSCALL
and TRACING program types [0].

[0]: https://lore.kernel.org/bpf/CAPhsuW6ud21v2xz8iSXf=CiDL+R_zpQ+p8isSTMTw=EiJQtRSw@xxxxxxxxxxxxxx/

Although iter/task_file already exists, for CRIB we still need the

What is CRIB?


CRIB is Checkpoint/Restore In eBPF.

Introductions to CRIB can be found at the following links:

https://lpc.events/event/18/contributions/1812/
https://lore.kernel.org/bpf/AM6PR03MB58488045E4D0FA6AEDC8BDE099A52@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#u
https://lwn.net/Articles/984313/

open-coded iterator style process file iterator, and the same is true
for other bpf iterators such as iter/tcp, iter/udp, etc.

The traditional bpf iterator is more like a bpf version of procfs, but
similar to procfs, it is not suitable for CRIB scenarios that need to
obtain large amounts of complex, multi-level in-kernel information.

The following is from previous discussions [2].

[2]: https://lore.kernel.org/bpf/AM6PR03MB5848CA34B5B68C90F210285E99B12@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

This is because the context of bpf iterators is fixed and bpf iterators
cannot be nested. This means that a bpf iterator program can only
complete a specific small iterative dump task, and cannot dump
multi-level data.

An example, when we need to dump all the sockets of a process, we need
to iterate over all the files (sockets) of the process, and iterate over
the all packets in the queue of each socket, and iterate over all data
in each packet.

If we use bpf iterator, since the iterator can not be nested, we need to
use socket iterator program to get all the basic information of all
sockets (pass pid as filter), and then use packet iterator program to
get the basic information of all packets of a specific socket (pass pid,
fd as filter), and then use packet data iterator program to get all the
data of a specific packet (pass pid, fd, packet index as filter).

This would be complicated and require a lot of (each iteration)
bpf program startup and exit (leading to poor performance).

By comparison, open coded iterator is much more flexible, we can iterate
in any context, at any time, and iteration can be nested, so we can
achieve more flexible and more elegant dumping through open coded
iterators.

With open coded iterators, all of the above can be done in a single
bpf program, and with nested iterators, everything becomes compact
and simple.

Also, bpf iterators transmit data to user space through seq_file,
which involves a lot of open (bpf_iter_create), read, close syscalls,
context switching, memory copying, and cannot achieve the performance
of using ringbuf.

Signed-off-by: Juntong Deng <juntong.deng@xxxxxxxxxxx>
---
v4 -> v5:
* Add file type checks in test cases for process file iterator
   and bpf_fget_task().

* Use fentry to synchronize tests instead of waiting in a loop.

* Remove path_d_path_kfunc_non_lsm test case.

* Replace task_lookup_next_fdget_rcu() with fget_task_next().

* Remove future merge conflict section in cover letter (resolved).

v3 -> v4:
* Make all kfuncs generic, not CRIB specific.

* Move bpf_fget_task to fs/bpf_fs_kfuncs.c.

* Remove bpf_iter_task_file_get_fd and bpf_get_file_ops_type.

* Use struct bpf_iter_task_file_item * as the return value of
   bpf_iter_task_file_next.

* Change fd to unsigned int type and add next_fd.

* Add KF_RCU_PROTECTED to bpf_iter_task_file_new.

* Make fs kfuncs available to SYSCALL and TRACING program types.

* Update all relevant test cases.

* Remove the discussion section from cover letter.

v2 -> v3:
* Move task_file open-coded iterator to kernel/bpf/helpers.c.

* Fix duplicate error code 7 in test_bpf_iter_task_file().

* Add comment for case when bpf_iter_task_file_get_fd() returns -1.

* Add future plans in commit message of "Add struct file related
   CRIB kfuncs".

* Add Discussion section to cover letter.

v1 -> v2:
* Fix a type definition error in the fd parameter of
   bpf_fget_task() at crib_common.h.

Juntong Deng (5):
   bpf: Introduce task_file open-coded iterator kfuncs
   selftests/bpf: Add tests for open-coded style process file iterator
   bpf: Add bpf_fget_task() kfunc
   bpf: Make fs kfuncs available for SYSCALL and TRACING program types
   selftests/bpf: Add tests for bpf_fget_task() kfunc

  fs/bpf_fs_kfuncs.c                            |  42 ++++---
  kernel/bpf/helpers.c                          |   3 +
  kernel/bpf/task_iter.c                        |  92 ++++++++++++++
  .../testing/selftests/bpf/bpf_experimental.h  |  15 +++
  .../selftests/bpf/prog_tests/fs_kfuncs.c      |  46 +++++++
  .../testing/selftests/bpf/prog_tests/iters.c  |  79 ++++++++++++
  .../selftests/bpf/progs/fs_kfuncs_failure.c   |  33 +++++
  .../selftests/bpf/progs/iters_task_file.c     |  88 ++++++++++++++
  .../bpf/progs/iters_task_file_failure.c       | 114 ++++++++++++++++++
  .../selftests/bpf/progs/test_fget_task.c      |  63 ++++++++++
  .../selftests/bpf/progs/verifier_vfs_reject.c |  10 --
  11 files changed, 559 insertions(+), 26 deletions(-)
  create mode 100644 tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
  create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file.c
  create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file_failure.c
  create mode 100644 tools/testing/selftests/bpf/progs/test_fget_task.c

--
2.39.5






[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux