On Fri, Oct 27, 2023 at 10:13 AM Jann Horn <jannh@xxxxxxxxxx> wrote: > > Hi! > > bpf_d_path() can be invoked on a "struct path" that results from > following a pointer chain involving pointers that can concurrently > change; this can lead to stuff like use-after-free in d_path(). > > For example, the BPF verifier permits stuff like > bpf_d_path(¤t->mm->exe_file->f_path, ...), which is not actually > safe in many contexts: > > current->mm->exe_file can concurrently change; so by the time > bpf_d_path() is called, the file's refcount might have gone to zero, > and __fput() may have already mostly torn down the file. "struct file" > currently has some limited RCU lifetime, but that is supposed to be an > implementation detail that BPF shouldn't be relying on, and "struct > file" will soon have even less RCU lifetime than before (see > <https://lore.kernel.org/all/20230930-glitzer-errungenschaft-b86880c177c4@brauner/>). > > When __fput() tears down a file, it drops the references held by > file->f_path.mnt and file->f_path.dentry. "struct vfsmount" has some > kind of RCU lifetime, but "struct dentry" will be freed directly in > dentry_free() if it has DCACHE_NORCU set, which is the case if it was > allocated via d_alloc_pseudo(), which is how memfd files are > allocated. > > So the following race is possible, if we start in a situation where > current->mm->exe_file points to a memfd: > > thread A thread B > ======== ======== > begin RCU section > begin BPF program > compute path = ¤t->mm->exe_file->f_path > > prctl(PR_SET_MM, PR_SET_MM_MAP, ...) > updates current->mm->exe_file > calls fput() on old ->exe_file > __fput() runs > dput(dentry); > mntput(mnt) > > invoke helper bpf_d_path(path, ...) > d_path() > reads path->dentry->d_op *** UAF read *** > reads path->dentry->d_op->d_dname *** read through wild pointer *** > path->dentry->d_op->d_dname(...) *** wild pointer call *** > > So if an attacker managed to reallocate the old "struct dentry" with > attacker-controlled data, they could probably get the kernel to call > an attacker-provided function pointer, eventually letting an attacker > gain kernel privileges. > > Obviously this is not a bug an unprivileged attacker can just hit > directly on a system where no legitimate BPF programs are already > running, because loading tracing BPF programs requires privileges; but > if a privileged process loads a tracing BPF program that does > something unsafe like "bpf_d_path(¤t->mm->exe_file->f_path, > ...)", an attacker might be able to leverage that. Thanks for the report. That's a verifier bug indeed. Curious, did you actually see such broken bpf program or this is theoretical issue in case somebody will write such thing ? > > If BPF wants to be able to promise that buggy BPF code can't crash the > kernel (or, worse, introduce privilege escalation vulnerabilities in > the kernel), Only the former. The verifier cannot possibly guarantee that the bpf-lsm program or tracing bpf prog is not leaking addresses or acting maliciously. Same in networking. XDP prog might be doing firewalling incorrectly, dropping wrong packets, disabling ssh when it shouldn't, etc. We cannot validate semantics. The verifier tries to guarantee non-crash only. Hence loading bpf prog is a privileged operation. But back to the verifier bug... I suspect it will be very hard to craft a test that does prctl(PR_SET_MM) and goes all the way through the delayed fput logic on one cpu while bpf prog under rcu_read_lock calls bpf_d_path on the other cpu. I can see this happening in theory and we need to close this verification gap, but we need to be realistic in assessing the severity of it. To fix it we need to make bpf_d_path KF_TRUSTED_ARGS. All new kfuncs are done this way already. They don't allow unrestricted pointer walks.