On Mon, May 31, 2021 at 4:24 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
> On 5/29/21 8:48 PM, Paul Moore wrote:
> [...]
> > Daniel's patch sidesteps that worry by just doing the lockdown
> > permission check when the BPF program is loaded, but that isn't a
> > great solution if the policy changes afterward. I was hoping there
> > might be some way to perform the permission check as needed, but the
> > more I look the more that appears to be difficult, if not impossible
> > (once again, corrections are welcome).
>
> Your observation is correct, will try to clarify below a bit.
>
> > I'm now wondering if the right solution here is to make use of the LSM
> > notifier mechanism. I'm not yet entirely sure if this would work from
> > a BPF perspective, but I could envision the BPF subsystem registering
> > a LSM notification callback via register_blocking_lsm_notifier(), see
> > the Infiniband code as an example, and then when the LSM(s) policy
> > changes the BPF subsystem would get a notification and it could
> > revalidate the existing BPF programs and block/remove/whatever
> > the offending BPF programs. This obviously requires a few things
> > which I'm not sure are easily done, or even possible:
> >
> > 1. Somehow the BPF programs would need to be "marked" at
> > load/verification time with respect to their lockdown requirements so
> > that decisions can be made later. Perhaps a flag in bpf_prog_aux?
> >
> > 2. While it looks like it should be possible to iterate over all of
> > the loaded BPF programs in the LSM notifier callback via
> > idr_for_each(prog_idr, ...), it is not clear to me if it is possible
> > to safely remove, or somehow disable, BPF programs once they have been
> > loaded. Hopefully the BPF folks can help answer that question.
> >
> > 3. Disabling of BPF programs might be preferable to removing them
> > entirely on LSM policy changes as it would be possible to make the
> > lockdown state less restrictive at a future point in time, allowing
> > for the BPF program to be executed again. Once again, not sure if
> > this is even possible.
>
> Part of why this gets really complex/impossible is that BPF programs in
> the kernel are reference counted from various sides, be it that there
> are references from user space to them (fd from application, BPF fs, or
> BPF links), hooks where they are attached to as well as tail call maps
> where one BPF prog calls into another. There is currently also no global
> infra of some sort where you could piggy back to atomically keep track of
> all the references in a list or such. And the other thing is that BPF progs
> have no ownership that is tied to a specific task after they have been
> loaded. Meaning, once they are loaded into the kernel by an application
> and attached to a specific hook, they can remain there potentially until
> reboot of the node, so lifecycle of the user space application != lifecycle
> of the BPF program.

I don't think the disjoint lifecycle or lack of task ownership is a
deal breaker from a LSM perspective as the LSMs can stash whatever
info they need in the security pointer during the program allocation
hook, e.g. selinux_bpf_prog_alloc() saves the security domain which
allocates/loads the BPF program.

The thing I'm worried about would be the case where a LSM policy
change requires that an existing BPF program be removed or disabled.
I'm guessing based on the refcounting that there is not presently a
clean way to remove a BPF program from the system, but is this
something we could resolve? If we can't safely remove a BPF program
from the system, can we replace/swap it with an empty/NULL BPF
program?
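
To make the replace/swap question concrete, here is a minimal
userspace sketch in plain C. It is not kernel code and it is not how
BPF works today: every name in it (struct model_prog,
model_policy_changed(), and so on) is hypothetical, the "domain" field
is only a stand-in for whatever an LSM would stash in the security
pointer, and it ignores the refcounting and concurrency problems
described above, which are the hard part. It only shows the shape of
the idea: record the lockdown requirements at load time, then have a
policy-change callback revalidate each program and point the hook at
an empty program when the new policy no longer permits it.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_DOMAINS 2

/* Hypothetical lockdown requirement bits, recorded at load time. */
enum model_lockdown_req {
	MODEL_REQ_NONE       = 0,
	MODEL_REQ_READ_KERN  = 1 << 0,	/* program reads kernel memory */
	MODEL_REQ_WRITE_USER = 1 << 1,	/* program writes user memory */
};

typedef int (*model_prog_fn)(int ctx);

/* Userspace stand-in for a loaded program; not struct bpf_prog. */
struct model_prog {
	unsigned int  lockdown_req;	/* "marked" at load/verification time */
	int           domain;		/* stand-in for the LSM security blob */
	model_prog_fn real_fn;		/* the program that was loaded */
	model_prog_fn active_fn;	/* what the hook actually runs */
};

/* Hypothetical policy: operations each domain is allowed to perform. */
struct model_policy {
	unsigned int allowed[MODEL_NR_DOMAINS];
};

/* Empty replacement program: does nothing and returns 0. */
static int model_noop_prog(int ctx)
{
	(void)ctx;
	return 0;
}

/* Trivial example program used for illustration only. */
static int model_sample_prog(int ctx)
{
	return ctx + 1;
}

/* Record requirements and owner at load time; assumes the load-time check passed. */
static void model_prog_load(struct model_prog *prog, int domain,
			    unsigned int req, model_prog_fn fn)
{
	prog->lockdown_req = req;
	prog->domain = domain;
	prog->real_fn = fn;
	prog->active_fn = fn;
}

/* A program is permitted if every requirement bit is allowed for its domain. */
static int model_policy_permits(const struct model_policy *pol,
				const struct model_prog *prog)
{
	return (prog->lockdown_req & ~pol->allowed[prog->domain]) == 0;
}

/*
 * Model of the policy-change callback: revalidate every program, swap
 * offenders for the empty program and restore programs that are
 * permitted again. Returns the number of programs left disabled.
 */
static size_t model_policy_changed(const struct model_policy *pol,
				   struct model_prog *progs, size_t n)
{
	size_t i, disabled = 0;

	for (i = 0; i < n; i++) {
		if (model_policy_permits(pol, &progs[i])) {
			progs[i].active_fn = progs[i].real_fn;
		} else {
			progs[i].active_fn = model_noop_prog;
			disabled++;
		}
	}
	return disabled;
}

/* What a hook would call: always runs whatever is currently active. */
static int model_prog_run(const struct model_prog *prog, int ctx)
{
	assert(prog != NULL && prog->active_fn != NULL);
	return prog->active_fn(ctx);
}
```

With a table of such programs, calling model_policy_changed() after
each policy load disables only the offenders, and a later, less
restrictive policy restores them because real_fn is kept around. Doing
the equivalent swap safely against live hooks, links and tail call
maps is exactly the open question.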

> It's maybe best to compare this aspect to kernel modules in the sense that
> you have an application that loads it into the kernel (insmod, etc, where
> you could also enforce lockdown signature check), but after that, they can
> be managed by other entities as well (implicitly refcounted from kernel,
> removed by other applications, etc).

Well, I guess we could consider BPF programs as out-of-tree kernel
modules that potentially do very odd and dangerous things, e.g.
performing access control checks *inside* access control checks ...
but yeah, I get your point at a basic level, I just think that
comparing BPF programs to kernel modules is a not-so-great comparison
in general.

> My understanding of the lockdown settings is that users have options
> to select/enforce a lockdown level of CONFIG_LOCK_DOWN_KERNEL_FORCE_{INTEGRITY,
> CONFIDENTIALITY} at compilation time, they have a lockdown={integrity|
> confidentiality} boot-time parameter, /sys/kernel/security/lockdown,
> and then more fine-grained policy via 59438b46471a ("security,lockdown,selinux:
> implement SELinux lockdown"). Once you have set a global policy level,
> you cannot revert back to a less strict mode.

I don't recall there being anything in the SELinux lockdown support
that prevents a newly loaded policy from allowing a change in the
lockdown level, either stricter or more permissive, for a given
domain. Looking quickly at the code, that still seems to be the case.

The SELinux lockdown access controls function independently of the
global build and runtime lockdown configuration.

> So the SELinux policy is
> specifically tied around tasks to further restrict applications in respect
> to the global policy.

As a reminder, there is no guarantee that the SELinux and lockdown
LSMs are both loaded and active at runtime; it is possible that only
SELinux is active. If SELinux is the only LSM enforcing lockdown
access controls, there is no global lockdown setting; it is determined
per-domain.

> I presume that would mean for those users that majority
> of tasks have the confidentiality option set via SELinux with just a few
> necessary using the integrity global policy. So overall the enforcing
> option when BPF program is loaded is the only really sensible option to
> me given only there we have the valid current task where such policy can
> be enforced.

-- 
paul moore
www.paul-moore.com