On Sat, Oct 21, 2023 at 4:19 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2023/10/04 0:09, KP Singh wrote:
> >> What I expected is "allocate memory where amount is determined at runtime" (e.g. alloc(), realloc()).
> >
> > One can use dynamically sized allocations on the ring buffer with
> > dynamic pointers:
> >
> > http://vger.kernel.org/bpfconf2022_material/lsfmmbpf2022-dynptr.pdf
> >
> > Furthermore, there are some use cases that seemingly need dynamic
> > memory allocation but not really. e.g. there was a need to audit
> > command line arguments and while it seems dynamic and one can chunk
> > the allocation to finite sizes, put these on a ring buffer and process
> > the chunks.
> >
> > It would be nice to see more details of where the dynamic allocation
> > is needed. Security blobs are allocated dynamically but have a fixed
> > size.
>
> Dynamic allocation is not for security blobs. Dynamic allocation is for
> holding requested pathnames (short-lived allocation), holding audit logs
> (FIFO allocation), holding/appending access control rules (long-lived

This is a ring buffer. BPF already has one, and it is used for the very
use case you mentioned (audit logs). Please read the original RFC and
patches for BPF LSM. We have deployed this at scale and it's very
efficient (memory and compute wise).

> allocation).

This is a map; not all maps need to be preallocated. An access control
rule can fundamentally be implemented as a map. I recommend reading
most of the BPF selftests to learn what can be done / accomplished.

> >
> >
> >> Some of core requirements for implementing TOMOYO/AKARI/CaitSith-like programs
> >> using BPF will be:
> >>
> >> The program registered cannot be stopped/removed by the root user.
> >> This is made possible by either building the program into vmlinux or loading
> >> the program as a LKM without module_exit() callback. Is it possible to guarantee
> >> that a BPF program cannot be stopped/removed by user's operations?
> >
> > Yes, there is a security_bpf hook where a BPF MAC policy can be
> > implemented and other LSMs do that already.
> >
> >>
> >> The program registered cannot be terminated by safety mechanisms (e.g. excessive
> >> CPU time consumption). Are there mechanisms in BPF that wouldn't have terminated
> >> a program if the program were implemented as a LKM rather than a BPF program?
> >>
> >
> > The kernel does not terminate BPF LSM programs, once a BPF program is
> > loaded and attached to the LSM hook, it's JITed into native code.
> > From there onwards, as far as the kernel is concerned it's just like
> > any other kernel function.
>
> I was finally able to build and load tools/testing/selftests/bpf/progs/lsm.c and
> tools/testing/selftests/bpf/prog_tests/test_lsm.c , and I found fatal limitation

Programs can also be pinned under /sys/fs/bpf, similar to maps. This
allows them to persist even after the loading program goes away.

Here's an example of a pinned program:

https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/bpf/flow_dissector_load.c#L39

> that the program registered is terminated when the file descriptor which refers to
> tools/testing/selftests/bpf/lsm.bpf.o is closed (due to e.g. process termination).
> That is, eBPF programs are not reliable/robust enough to implement TOMOYO/AKARI/
> CaitSith-like programs. Re-registering when the file descriptor is closed is racy

Re-registering is not needed, as programs can be pinned too.

> because some critical operations might fail to be traced/checked by the LSM hooks.
>
> Also, I think that automatic cleanup upon closing the file descriptor implies that
> allocating resources (or getting reference counts) that are not managed by the BPF
> (e.g. files under /sys/kernel/security/tomoyo/ directory) is not permitted. That's
> very bad.
>
> >
> >>
> >> Amount of memory needed for managing data is not known at compile time.
> >> Thus, I need
> >> kmalloc()-like memory allocation mechanism rather than allocating from some pool, and
> >> manage chunk of memory regions using linked list. Does BPF have kmalloc()-like memory
> >> allocation mechanism that allows allocating up to 32KB (8 pages if PAGE_SIZE=4096).
> >>
> >
> > You use the ring buffer as a large pool and use dynamic pointers to
> > carve chunks out of it, if truly dynamic memory is needed.
>
> TOMOYO/AKARI/CaitSith-like programs do need dynamic memory allocation, as max amount of
> memory varies from less than 1MB to more than 10MB. Preallocation is too much wasteful.
>
> >
> >
> >> And maybe somewhere documented question:
> >>
> >> What kernel functions can a BPF program call / what kernel data can a BPF program access?
> >
> > BPF programs can access kernel data dynamically (accesses relocated at
> > load time without needing a recompile). There are a lot of good details
> > in:
> >
> > https://nakryiko.com/posts/bpf-core-reference-guide/
> >
> >
> >> The tools/testing/selftests/bpf/progs/test_d_path.c suggests that a BPF program can call
> >> d_path() defined in fs/d_path.c . But is that because d_path() is marked as EXPORT_SYMBOL() ?
> >> Or can a BPF program call almost all functions (like SystemTap script can insert hooks into
> >> almost all functions)? Even functions / data in LKM can be accessed by a BPF program?
> >>
> >
> > It's not all kernel functions, but there is a wide range of helpers
> > and kfuncs (examples in tools/testing/selftests/bpf) and if there is
> > something missing, we will help you.
>
> I couldn't build tools/testing/selftests/bpf/progs/lsm.c with printk() added.
> Sending to /sys/kernel/debug/tracing/trace_pipe via bpf_printk() is not enough for
> reporting critical/urgent problems. Synchronous operation is important.

You cannot call arbitrary kernel functions from within BPF. If you need
to call something, it needs to be exported as a kfunc (you need to send
patches to the mailing list for it).
This is because we want to ensure that BPF programs can be verified.

>
> Since printk() is not callable, most of functions which TOMOYO/AKARI/CaitSith-like
> programs use seem to be not callable.

It seems like you are trying to re-implement an existing LSM's code
base 1:1 in BPF. That's surely not going to work. You need to think
about the use-case / policy you are trying to implement and then write
the code in BPF independently. Please share concrete examples of the
policy you want to implement and we will try to help you. Asking for
features where you want 1:1 parity with kernel code, without concrete
policy use-cases, is not going to enable us to help you.

- KP

>
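
A userspace sketch of the chunking approach mentioned above (splitting
variable-length data such as command line arguments into fixed-size
records on a ring buffer) might look like the following. This is plain C
and not BPF code: the struct layout, the sizes and the
ring_push_string() / ring_pop_string() names are all hypothetical and
only model the idea. In a real BPF program the records would instead be
reserved with bpf_ringbuf_reserve() and committed with
bpf_ringbuf_submit(), with userspace reassembling them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_DATA 16 /* fixed payload bytes per record (hypothetical) */
#define RING_SLOTS 8  /* records the model ring can hold (hypothetical) */

/* One fixed-size record; a long string becomes several of these. */
struct chunk {
	unsigned int seq;   /* index of this chunk within its string */
	unsigned int len;   /* bytes of data[] actually used */
	unsigned char last; /* 1 on the final chunk of a string */
	char data[CHUNK_DATA];
};

/* Toy single-threaded ring; head and tail only ever increase. */
struct ring {
	struct chunk slot[RING_SLOTS];
	size_t head; /* next slot to write */
	size_t tail; /* next slot to read */
};

static size_t ring_free(const struct ring *r)
{
	return RING_SLOTS - (r->head - r->tail);
}

/*
 * Producer: split src[0..len) into fixed-size records.
 * Returns the number of chunks written, or -1 (writing nothing)
 * if the ring does not have room for the whole string.
 */
int ring_push_string(struct ring *r, const char *src, size_t len)
{
	size_t need = len ? (len + CHUNK_DATA - 1) / CHUNK_DATA : 1;

	assert(r != NULL && src != NULL);
	if (need > ring_free(r))
		return -1;

	for (size_t i = 0; i < need; i++) {
		struct chunk *c = &r->slot[r->head % RING_SLOTS];
		size_t off = i * CHUNK_DATA;
		size_t n = len - off < CHUNK_DATA ? len - off : CHUNK_DATA;

		c->seq = (unsigned int)i;
		c->len = (unsigned int)n;
		c->last = (i == need - 1);
		memcpy(c->data, src + off, n);
		r->head++;
	}
	return (int)need;
}

/*
 * Consumer: reassemble one string into out (NUL-terminated).
 * Returns its length, or -1 if the ring is empty or out is too
 * small (chunks consumed before the failure are discarded).
 */
int ring_pop_string(struct ring *r, char *out, size_t cap)
{
	size_t used = 0;

	while (r->tail != r->head) {
		const struct chunk *c = &r->slot[r->tail % RING_SLOTS];

		r->tail++;
		if (used + c->len >= cap)
			return -1;
		memcpy(out + used, c->data, c->len);
		used += c->len;
		if (c->last) {
			out[used] = '\0';
			return (int)used;
		}
	}
	return -1;
}
```

The point of the model is that the producer never needs an allocation
whose size depends on the input: every record has the same size, and
only the number of records varies.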