On Wed, Apr 8, 2020 at 12:42 PM Jann Horn <jannh@xxxxxxxxxx> wrote: > > Hi! > > I saw that BPF allows root to create frozen maps, for which the > verifier then assumes that they contain constant values. However, map > freezing is pretty wobbly: > > 1. The syscalls for updating maps from userspace don't seem to lock > the map at all. True, there is a tiny race between freezing and map updates, but I don't think it's possible to solve it without taking locks all around the place in map update operations. > 2. BPF doesn't account for the fact that mprotect() can be used to > arbitrarily flip VM_WRITE on and off as long as VM_MAYWRITE is set. > (crasher attached as bpf-constant-data-mprotect.c) Yeah, my bad. I wasn't aware of VM_MAYWRITE. I'll post a fix dropping VM_MAYWRITE (and VM_MAYEXEC while at that...) for frozen maps. That should fix bpf-constant-mprotect.c > 3. It is assumed that a memory mapping can't be used to write to a > page anymore after the mapping has been removed; but actually, > userspace can grab references to pages in a VMA and use those > references to write to the VMA's pages after the VMA has already been > closed. (crasher attached as bpf-constant-data-uffd.c, compile with > "gcc -pthread ...") Please help me understand how that works (assuming we drop VM_MAYWRITE, of course). You mmap() as R/W, then munmap(), then freeze(). After munmap() refcount of writable pages should drop to zero. And mmap'ed address should be invalid and unmapped. I'm missing how after munmap() parallel thread still can write to that memory page? > > These things are probably not _huge_ concerns for most usecases, since > you need to be root to hit this stuff anyway - but I think it'd be > desirable for BPF to actually be memory-safe (and the kernel lockdown > folks would probably appreciate not having such a glaring hole that > lets root read/write arbitrary memory). > > The third point is particularly hard to solve without adding more > constraints on the userspace API; I think that tightening up map > freezing would require ensuring that the map has *never* been mapped > as writable. I'm clearly missing how memory can remain mmaped() after single mmap() + munmap(), I'd really appreciate if you could elaborate, thanks! > > Is there a reason why the verifier doesn't replace loads from frozen > maps with the values stored in those maps? That seems like it would be > not only easier to secure, but additionally more performant. Verifier doesn't always know exact offset at which program is going to read read-only map contents. So something like this works: const volatile long arr[256]; int x = rand() % 256; int y = arr[x]; In this case verifier doesn't really know the value of y, so it can't be inlined. Then you can have code in which in one branch register is loaded with known value, but in another branch same register gets some value at random offset. Constant tracking is code path-sensitive, while instructions are shared between different code paths. Unless I'm missing what you are proposing :)