On Thu, Jan 28, 2021 at 5:53 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Thu, Jan 28, 2021 at 05:31:35PM -0800, Andy Lutomirski wrote: > > > > What exactly could the fault code even do to fix this up? Something like: > > > > if (addr == 0 && SMAP off && error_code says it's kernel mode && we > > don't have permission to map NULL) { > > special care for bpf; > > } > > right. where 'special care' is checking extable and skipping single > load instruction. > > > This seems arbitrary and fragile. And it's not obviously > > implementable safely without taking locks, > > search_bpf_extables() needs rcu_read_lock() only. > Not the locks you're talking about. I mean the locks in the if statement. How am I supposed to tell whether this fault is a special bpf fault or a normal page fault without taking a lock to look up the VMA or to do some other hack? Also, why should BPF get special dispensation to generate a special kind of nonsensical page fault that no other kernel code is allowed to do? > > > which we really ought not > > to be doing from inside arbitrary bpf programs. > > Not in arbitrary progs. It's only available to progs loaded by root. > > > Keep in mind that, if > > SMAP is unavailable, the fault code literally does not know whether > > the page fault originated form a valid uaccess region. > > > > There's also always the possibility that you simultaneously run bpf > > and something like Wine or DOSEMU2 that actually maps the zero page, > > in which case the effect of the bpf code is going to be quite erratic > > and will depend on which mm is loaded. > > It's tracing. Folks who write those progs accepted the fact that > the data returned by probe_read is not going to be 100% accurate. Is this really all tracing or is some of it actual live network code? > > bpf jit can insert !null check before every such special load > (since it knows all of them), but it's an obvious performance loss > that would be good to avoid. If fault handler can do this > if (addr == 0 && ...) search_bpf_extables() > at least in some conditions and cpu flags it's already a win. > In all other cases bpf jit will insert !null checks. Again, there is no guarantee that a page fault is even generated for this. And it doesn't seem very reasonable for the fault code to have to decide whether a NULL pointer dereference is a special BPF fault or a real user access fault against a VMA at address 9. --Andy