* Ingo Molnar (mingo@xxxxxxx) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
>
> > I am not asking for the pf handler to handle every possible kind
> > of fault recursively. Just to keep the in-kernel page fault
> > related code for vmalloc (and possibly for prefetch ?) paths
> > NMI-reentrant :
> >
> > void do_page_fault(struct pt_regs *regs, unsigned long error_code)
> >
> >   address = read_cr2();
>
> Why would this be needed? We read the cr2 as the first thing in
> do_page_fault(). It can be destroyed and re-faulted at will after
> that point, it won't matter a bit - we have already read it.
>

With respect to cr2, yes, this is the only window we care about. However, the
rest of vmalloc_fault() must be audited for other uses of data structures that
are not NMI-safe (e.g. "current"), which I did in the past. My intent was just
to respond to Peter's concerns by showing that the part of the page fault
handler which needs to be NMI-reentrant is really not that big.

Mathieu

> The only window to be careful about wrt. cr2 is the small window
> starting at <page_fault>, leading into <do_page_fault>:
>
>  ffffffff8154085f <do_page_fault>:
>  ffffffff8154085f:  55                      push   %rbp
>  ffffffff81540860:  48 89 e5                mov    %rsp,%rbp
>  ffffffff81540863:  41 57                   push   %r15
>  ffffffff81540865:  41 56                   push   %r14
>  ffffffff81540867:  49 89 f6                mov    %rsi,%r14
>  ffffffff8154086a:  41 55                   push   %r13
>  ffffffff8154086c:  49 89 fd                mov    %rdi,%r13
>  ffffffff8154086f:  41 54                   push   %r12
>  ffffffff81540871:  53                      push   %rbx
>  ffffffff81540872:  48 83 ec 18             sub    $0x18,%rsp
>  ffffffff81540876:  65 4c 8b 3c 25 00 b0    mov    %gs:0xb000,%r15
>  ffffffff8154087d:  00 00
>  ffffffff8154087f:  49 8b 87 48 02 00 00    mov    0x248(%r15),%rax
>  ffffffff81540886:  48 89 45 d0             mov    %rax,-0x30(%rbp)
>  ffffffff8154088a:  48 83 c0 60             add    $0x60,%rax
>  ffffffff8154088e:  48 89 45 c8             mov    %rax,-0x38(%rbp)
>  ffffffff81540892:  0f 18 08                prefetcht0 (%rax)
>  ffffffff81540895:  41 0f 20 d4             mov    %cr2,%r12
>
> Look how early we read out cr2 - after trapping we read it after
> about 40 straight instructions, with no other function call
> in between. Only an NMI (or an MCE and similar deep-atomic contexts)
> can get in that window.
>
> ( Btw., a sidenote: the prefetcht0 right before the cr2 read is a
>   real bug. Prefetches can sometimes generate false faults and thus
>   destroy the value of cr2. I'll send a patch for that soon. )
>
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
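
For readers following the cr2 argument, here is a small user-space model of the window described above. It is only a sketch under stated assumptions, not kernel code: fake_cr2, CLOBBER_VALUE, enum nmi_point, model_nmi() and model_page_fault() are hypothetical names invented for illustration, and a plain global variable stands in for the real cr2 register. The model only shows that a nested fault taken before the early read changes what the handler sees, while one taken after the read does not.

```c
#include <assert.h>

/* Hypothetical stand-in for the cr2 register; not the real hardware register. */
static unsigned long fake_cr2;

/* Arbitrary address written by the nested fault in this model. */
#define CLOBBER_VALUE 0xdeadUL

/* Where the modelled NMI fires relative to the handler's cr2 read. */
enum nmi_point { NMI_NONE, NMI_BEFORE_READ, NMI_AFTER_READ };

/* Modelled NMI that itself takes a fault, overwriting cr2. */
static void model_nmi(void)
{
    fake_cr2 = CLOBBER_VALUE;
}

/*
 * Modelled fault handler. Returns the address it believes faulted.
 * The read of fake_cr2 plays the role of the early read_cr2() call.
 */
static unsigned long model_page_fault(unsigned long fault_addr,
                                      enum nmi_point when)
{
    unsigned long address;

    /* The two outcomes must be distinguishable in this model. */
    assert(fault_addr != CLOBBER_VALUE);

    fake_cr2 = fault_addr;      /* fault latches the faulting address */

    if (when == NMI_BEFORE_READ)
        model_nmi();            /* inside the window: cr2 is lost */

    address = fake_cr2;         /* the early read */

    if (when == NMI_AFTER_READ)
        model_nmi();            /* after the read: no effect on address */

    return address;
}
```

Calling model_page_fault(0x1000UL, NMI_AFTER_READ) returns the original address, while model_page_fault(0x1000UL, NMI_BEFORE_READ) returns the clobbered one, which is the case the window discussion is concerned with.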