On Fri, May 28, 2021 at 2:28 PM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
> In the case of tracing, it's different. You install small programs that are
> triggered when certain events fire. Random example from bpftrace's README [0],
> you want to generate a histogram of syscall counts by program. One-liner is:
>
>   bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
>
> bpftrace then goes and generates a BPF prog from this internally. One way of
> doing it could be to call bpf_get_current_task() helper and then access
> current->comm via one of bpf_probe_read_kernel{,_str}() helpers ...

I think we can all agree that the BPF tracing is a bit chaotic in the
sense that the tracing programs can be executed in various
places/contexts and that presents some challenges with respect to
access control and auditing. If you are following the io_uring stuff
that is going on now you can see a little of what is required to make
audit work properly in the various io_uring contexts and that is
relatively small compared to what is possible with BPF tracing. Of
course this assumes I've managed to understand bpf tracing properly
this morning, and I very well may still be missing points and/or
confused about some of the important details. Corrections are
welcome.

Daniel's patch side steps that worry by just doing the lockdown
permission check when the BPF program is loaded, but that isn't a
great solution if the policy changes afterward. I was hoping there
might be some way to perform the permission check as needed, but the
more I look the more that appears to be difficult, if not impossible
(once again, corrections are welcome).

I'm now wondering if the right solution here is to make use of the LSM
notifier mechanism.
I'm not yet entirely sure if this would work from a BPF
perspective, but I could envision the BPF subsystem registering a LSM
notification callback via register_blocking_lsm_notifier(), see the
Infiniband code as an example, and then when the LSM(s) policy changes
the BPF subsystem would get a notification and it could revalidate the
existing BPF programs and block/remove/whatever the offending BPF
programs. This obviously requires a few things which I'm not sure are
easily done, or even possible:

1. Somehow the BPF programs would need to be "marked" at
load/verification time with respect to their lockdown requirements so
that decisions can be made later. Perhaps a flag in bpf_prog_aux?

2. While it looks like it should be possible to iterate over all of
the loaded BPF programs in the LSM notifier callback via
idr_for_each(prog_idr, ...), it is not clear to me if it is possible
to safely remove, or somehow disable, BPF programs once they have been
loaded. Hopefully the BPF folks can help answer that question.

3. Disabling of BPF programs might be preferable to removing them
entirely on LSM policy changes as it would be possible to make the
lockdown state less restrictive at a future point in time, allowing
for the BPF program to be executed again. Once again, not sure if
this is even possible.

Related, the lockdown LSM should probably also grow LSM notifier
support similar to selinux_lsm_notifier_avc_callback(), for example
either lock_kernel_down() or lockdown_write() might want to do a
call_blocking_lsm_notifier(LSM_POLICY_CHANGE, NULL) call.

-- 
paul moore
www.paul-moore.com
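
To make the three numbered items above a bit more concrete, here is a
rough userspace model of the control flow in plain C. It is only a
sketch and not kernel code: every name in it (model_prog,
model_registry, model_register_notifier(), model_set_level(), and so
on) is made up for illustration, the single callback slot stands in
for a real notifier chain, and the policy in model_prog_allowed() is
an assumption rather than what the lockdown LSM actually enforces.
Whether a loaded BPF program can really be disabled like this is
exactly the open question from item 2.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only: every name below is hypothetical and none of
 * this is kernel API.
 */

enum model_level {
	MODEL_LEVEL_NONE = 0,
	MODEL_LEVEL_INTEGRITY,
	MODEL_LEVEL_CONFIDENTIALITY,
};

struct model_prog {
	int id;
	bool reads_kernel_memory;	/* "mark" recorded at load time (item 1) */
	bool disabled;			/* disabled rather than removed (item 3) */
};

struct model_registry {
	struct model_prog *progs;
	size_t nr_progs;
};

typedef int (*model_notifier_fn)(enum model_level level, void *data);

/* A single callback slot standing in for a notifier chain. */
static model_notifier_fn model_cb;
static void *model_cb_data;

void model_register_notifier(model_notifier_fn fn, void *data)
{
	model_cb = fn;
	model_cb_data = data;
}

/* Assumed policy: kernel reads are denied only at the confidentiality level. */
bool model_prog_allowed(const struct model_prog *prog, enum model_level level)
{
	return !(prog->reads_kernel_memory &&
		 level >= MODEL_LEVEL_CONFIDENTIALITY);
}

/*
 * Notifier callback: walk every program (item 2) and update its
 * disabled state for the new level. Returns how many programs changed.
 */
int model_revalidate(enum model_level level, void *data)
{
	struct model_registry *reg = data;
	int changed = 0;
	size_t i;

	for (i = 0; i < reg->nr_progs; i++) {
		bool disable = !model_prog_allowed(&reg->progs[i], level);

		if (reg->progs[i].disabled != disable) {
			reg->progs[i].disabled = disable;
			changed++;
		}
	}
	return changed;
}

/* Policy change entry point: notifies the registered callback, if any. */
int model_set_level(enum model_level level)
{
	return model_cb ? model_cb(level, model_cb_data) : 0;
}
```

In this model a caller would register model_revalidate() once with a
pointer to its registry and then call model_set_level() on each policy
change. Because programs are only flagged as disabled and never freed,
lowering the level again re-enables them, which is the behaviour
item 3 is hoping for.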