Hey Christian, On Wed, Feb 21, 2024 at 08:55:25AM +0100, Christian Brauner wrote: > On Tue, Feb 20, 2024 at 01:22:14PM +0000, Matt Bobrowski wrote: > > On Tue, Feb 20, 2024 at 10:48:10AM +0100, Christian Brauner wrote: > > > On Tue, Feb 20, 2024 at 09:27:23AM +0000, Matt Bobrowski wrote: > > > > There has now been several reported instances [0, 1, 2] where the > > > > usage of the BPF helper bpf_d_path() has led to some form of memory > > > > corruption issue. > > > > > > > > The fundamental reason behind why we repeatedly see bpf_d_path() being > > > > susceptible to such memory corruption issues is because it only > > > > enforces ARG_PTR_TO_BTF_ID constraints onto it's struct path > > > > argument. This essentially means that it only requires an in-kernel > > > > pointer of type struct path to be provided to it. Depending on the > > > > underlying context and where the supplied struct path was obtained > > > > from and when, depends on whether the struct path is fully intact or > > > > not when calling bpf_d_path(). It's certainly possible to call > > > > bpf_d_path() and subsequently d_path() from contexts where the > > > > supplied struct path to bpf_d_path() has already started being torn > > > > down by __fput() and such. An example of this is perfectly illustrated > > > > in [0]. > > > > > > > > Moving forward, we simply cannot enforce KF_TRUSTED_ARGS semantics > > > > onto struct path of bpf_d_path(), as this approach would presumably > > > > lead to some pretty wide scale and highly undesirable BPF program > > > > breakage. To avoid breaking any pre-existing BPF program that is > > > > dependent on bpf_d_path(), I propose that we take a different path and > > > > re-implement an incredibly minimalistic and bare bone version of > > > > d_path() which is entirely backed by kernel probe-read semantics. IOW, > > > > a version of d_path() that is backed by > > > > copy_from_kernel_nofault(). This ensures that any reads performed > > > > against the supplied struct path to bpf_d_path() which may end up > > > > faulting for whatever reason end up being gracefully handled and fixed > > > > up. > > > > > > > > The caveats with such an approach is that we can't fully uphold all of > > > > d_path()'s path resolution capabilities. Resolving a path which is > > > > comprised of a dentry that make use of dynamic names via isn't > > > > possible as we can't enforce probe-read semantics onto indirect > > > > function calls performed via d_op as they're implementation > > > > dependent. For such cases, we just return -EOPNOTSUPP. This might be a > > > > little surprising to some users, especially those that are interested > > > > in resolving paths that involve a dentry that resides on some > > > > non-mountable pseudo-filesystem, being pipefs/sockfs/nsfs, but it's > > > > arguably better than enforcing KF_TRUSTED_ARGS onto bpf_d_path() and > > > > causing an unnecessary shemozzle for users. Additionally, we don't > > > > > > NAK. We're not going to add a semi-functional reimplementation of > > > d_path() for bpf. This relied on VFS internals and guarantees that were > > > never given. Restrict it to KF_TRUSTED_ARGS as it was suggested when > > > this originally came up or fix it another way. But we're not adding a > > > bunch of kfuncs to even more sensitive VFS machinery and then build a > > > d_path() clone just so we can retroactively justify broken behavior. > > > > OK, I agree, having a semi-functional re-implementation of d_path() is > > indeed suboptimal. However, also understand that slapping the > > The ugliness of the duplicated code made me start my mail with NAK. It > would've been enough to just say no. > > > KF_TRUSTED_ARGS constraint onto the pre-existing BPF helper > > bpf_d_path() would outright break a lot of BPF programs out there, so > > I can't see how taht would be an acceptable approach moving forward > > here either. > > > > Let's say that we decided to leave the pre-existing bpf_d_path() > > implementation as is, accepting that it is fundamentally succeptible > > to memory corruption issues, are you saying that you're also not for > > adding the KF_TRUSTED_ARGS d_path() variant as I've done so here > > No, that's fine and was the initial proposal anyway. You're already > using the existing d_path() anway in that bpf_d_path() thing. So > exposing another variant with KF_TRUSTED_ARGS restriction is fine. But > not hacking up a custom d_path() variant. OK, thank you for clarifying. Perhaps we should just make a remark in the form of a comment against bpf_d_path() stating that this BPF helper is considered unsafe and users should look to migrate to the newly added KF_TRUSTED_ARGS variant if at all possible. > > [0]. Or, is it the other supporting reference counting based BPF > > kfuncs [1, 2] that have irked you and aren't supportive of either? > > Yes, because you're exposing fs_root, fs_pwd, path_put() and fdput(), > get_task_exe_file(), get_mm_exe_file(). None of that I see being turned > into kfuncs. Hm, OK, but do know that BPF kfuncs do not make any promises around being a stable interface, they never have and never will. Therefore, it's not like introducing this kind of dependency on such APIs from BPF kfuncs would hinder you from fundamentally modifying them moving forward? Additionally, given that these new acquire/release based BPF kfuncs which rely on APIs like get_fs_root() and path_put() are in fact restricted to BPF LSM programs, usage of such BPF kfuncs from the context of a BPF LSM program would rather be analogous to a pre-existing LSM module calling get_fs_root() and path_put() explicitly within one of its implemented hooks, no? IOW, once a BPF LSM program is loaded and JITed, what's the fundamental difference between a baked in LSM module hook implementation which calls get_fs_root() and a BPF LSM program which calls bpf_get_task_fs_root()? They're both being used in a perfectly reasonable and sane like-for-like context, so what's the issue with exposing such APIs as BPF kfuncs if they're being used appropriately? It really doesn't make sense to provide independent reference counting implementations just for BPF if there's some pre-existing infrastructure in the kernel that does it the right way. Also note that without such new reference counting BPF kfuncs which I've proposed within this patch series the KF_TRUSTED_ARGS variant of bpf_d_path() that we've agreed on becomes somewhat difficult to use in practice. It'd essentially only be usable from LSM hooks that pass in a struct path via the context parameter. Whilst in reality, it's considered rather typical to also pass a struct path like ¤t->mm->exe_file->f_path and ¤t->fs->pwd to bpf_d_path() and friends from within the the implementation of an LSM hook. Such struct path objects nested some levels deep isn't considered as being trusted and therefore cannot be passed to a BPF kfunc that enforces the KF_TRUSTED_ARGS constraint. The only way to acquire trust on a pointer after performing such a struct walk is by grabbing a reference on it, and hence why this KF_TRUSTED_ARGS change to d_path() and these new BPF kfuncs go hand in hand. Apologies about all the questions and comments here, but I'm really just trying to understand why there's so much push back with regards adding these reference counting BPF kfuncs? /M