On Thu, 28 Feb 2019, Jann Horn wrote:
> +Josh for unwinding, +x86 folks
> On Wed, Feb 27, 2019 at 11:43 PM Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > Not understanding. That seems to be saying that we got a NULL pointer
> > deref in __generic_file_write_iter() at
> >
> >         written = generic_perform_write(file, from, iocb->ki_pos);
> >
> > which isn't possible.
> >
> > I'm not seeing recent changes in there which could have caused this. Help.
>
> +
>
> Maybe the problem is that the frame pointer unwinder isn't designed to
> cope with NULL function pointers - or more generally, with an
> unwinding operation that starts before the function's frame pointer
> has been set up?
>
> Unwinding starts at show_trace_log_lvl(). That begins with
> unwind_start(), which calls __unwind_start(), which uses
> get_frame_pointer(), which just returns regs->bp. But that frame
> pointer points to the part of the stack that's storing the address of
> the caller of the function that called NULL; the caller of NULL is
> skipped, as far as I can tell.
>
> What's kind of annoying here is that we don't have a proper frame set
> up yet, we only have half a stack frame (saved RIP but no saved RBP).

That wreckage is related to the fact that the indirect calls are going
through __x86_indirect_thunk_$REG.
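
To make the "half a stack frame" problem described in the quoted text
concrete, here is a toy user-space model. It is only a sketch and not kernel
code: the slot layout, struct toy_regs, the fn_names table and the
use_sp_hint fallback are all made up for illustration. Small integers stand
in for return addresses. The call to NULL has pushed a return address into
generic_perform_write(), but no prologue ever ran, so bp still belongs to
generic_perform_write()'s own frame.

```c
#include <assert.h>
#include <stdio.h>

#define STACK_SLOTS  8
#define END_OF_CHAIN (-1)

/* Toy "addresses": indices into fn_names stand in for return addresses. */
enum { FN_NULL, FN_PERFORM_WRITE, FN_WRITE_ITER, FN_VFS_WRITE };
const char *const fn_names[] = {
	"NULL", "generic_perform_write", "__generic_file_write_iter", "vfs_write",
};

struct toy_regs { int ip, sp, bp; };	/* sp and bp are slot indices */

/* Stack contents at the moment of the fault; higher index = older data. */
void build_stack(long stack[STACK_SLOTS], struct toy_regs *regs)
{
	for (int i = 0; i < STACK_SLOTS; i++)
		stack[i] = 0;			/* unused slots play the role of locals */
	stack[7] = FN_VFS_WRITE;		/* return address into vfs_write */
	stack[6] = END_OF_CHAIN;		/* saved bp of __generic_file_write_iter's frame */
	stack[4] = FN_WRITE_ITER;		/* return address into __generic_file_write_iter */
	stack[3] = 6;				/* saved bp of generic_perform_write's frame */
	stack[0] = FN_PERFORM_WRITE;		/* pushed by the call to NULL, no saved bp below it */
	regs->ip = FN_NULL;
	regs->sp = 0;
	regs->bp = 3;				/* still generic_perform_write's frame */
}

/*
 * Walk the saved-bp chain starting at regs->bp. With use_sp_hint set, a NULL
 * ip is taken to mean "no prologue ran", so the word at sp is reported as a
 * return address first (a hypothetical heuristic for this toy model).
 */
int unwind(const long *stack, const struct toy_regs *regs,
	   int use_sp_hint, int *trace, int max)
{
	int n = 0;

	if (n < max)
		trace[n++] = regs->ip;
	if (use_sp_hint && regs->ip == FN_NULL && n < max)
		trace[n++] = (int)stack[regs->sp];
	for (int bp = regs->bp; bp != END_OF_CHAIN && n < max; bp = (int)stack[bp]) {
		assert(bp >= 0 && bp + 1 < STACK_SLOTS);
		trace[n++] = (int)stack[bp + 1];	/* return address sits above saved bp */
	}
	return n;
}

void print_trace(const int *trace, int n)
{
	for (int i = 0; i < n; i++)
		printf("  %s\n", fn_names[trace[i]]);
}
```

In this model the plain walk (use_sp_hint == 0) reports NULL,
__generic_file_write_iter, vfs_write, so generic_perform_write() is missing,
which matches the confusing report. With use_sp_hint == 1 the entry for
generic_perform_write() shows up between NULL and __generic_file_write_iter.
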
I just verified on a VM with some other callback NULL'ed that the resulting
backtrace is not really helpful.

So in that case generic_perform_write() has two indirect calls:

  mapping->a_ops->write_begin() and ->write_end()

Thanks,

	tglx