On Sat, Jan 5, 2019 at 12:10 AM Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
>
> On Wed, Jan 02, 2019 at 02:37:58PM +0100, Dmitry Vyukov wrote:
> > If we are proceeding with "mm: some enhancements to the page fault
> > mechanism", that's good as it will eliminate at least part of this
> > output.
>
> Agreed.
>
> > There are 2 types of debug configs: ones add additional checks for
> > machines and another add verbose output for humans. CONFIG_DEBUG_VM
> > seems to be more of the first type of debug config -- additional
> > checks for machines. I've seen some of the second type debug configs
> > are prefixed with DEBUG_VERBOSE or something along these lines. Maybe
> > it makes sense to split this out of CONFIG_DEBUG_VM. Since it's a
> > "gray" check (rather then white/black check), it can't be used in
> > CI/fuzzing setups anyways -- not possible to analyse thousands of
> > cases manually (though maybe we actually hitting some that can be
> > classified as kernel bugs).
>
> The problem with the DEBUG_VERBOSE is that if the kernel needs a
> rebuild it's only marginally better than having to ask the developer
> to add a one liner dump_stack before rebuilding the kernel.
>
> Next time we need it, it may be simpler to figure a dynamic tracing
> trick than to ask a rebuild.
>
> In addition of pursuing the VM_FAULT_RETRY improvements from Peter,
> it'd be fine to drop it for now to avoid confusing the machines.
>
> > Yes, the problem is way more general. As you noted it applies to 100K
> > of EINVAL|EFAULT|ENOMEM, it's super hard to figure out what/where
> > exactly goes wrong in the kernel getting only -22. But at the same
>
> What stands out for this location is that it's the bailout point
> of all non uffd compatible syscalls.
>
> The chances such annotation turns out to be useful is much higher than
> a random -EINVAL location, but in principle it's the same kind of
> issue and I agree it'd be unpractical to annotate them manually.
>
> > time we can't have all of these 100K places dump stacks. I don't know
> > what's a good solution for this. Manually annotating 100K places does
> > not look like the right way to go. Maybe kprobes can do this? In some
> > cases I used CONFIG_KCOV and kcovtrace
> > (https://github.com/google/syzkaller/blob/master/tools/kcovtrace/kcovtrace.c)
> > to collect kernel trace from a failing syscall.
>
> If there is a de facto standard to search for a "syscall bailout call
> trace" that can provide us that very same dump_stack, that should work
> for this uffd issue too indeed.
>
> KCOV may or may not be enabled in enterprise -debug kernels, but the
> main limitation of the kcovtrace.c seems to be the lack of threading
> support and the privilege inheritance through fork. While it's not
> mandatory, the uffd manager is practically always implemented as a
> thread of some process so whatever alternative to dump_stack() should
> work with threads.
>
> I think kprobes/ebpf should work provided debug info is available
> (which is not always guaranteed), otherwise to obtain it without debug
> info, it'd require a ftrace static trace point to use with the
> function graph tracer I guess.

OK, not sure what are the action points here now besides marking this
bug report as dup:

#syz dup: KASAN: use-after-free Read in neigh_mark_dead