* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> Then, the NMI handler would be changed to always write that value
> to %cr2 after it has done the operation that could fault, and do
> an atomic increment of the NMI sequence count. Then, we can do
> something like this in the page fault handler:
>
>     if (cr2 == MAGIC_CR2) {
>         static unsigned long my_seqno = -1;
>         if (my_seqno != nmi_seqno) {
>             my_seqno = nmi_seqno;
>             return;
>         }
>     }
>
> where the whole (and only) point of that "seqno" is to protect against
> user space doing something like
>
>     int i = *(int *)MAGIC_CR2;
>
> and causing infinite faults.

Heh - this is so tricky that it's disgusting! Lovely.

And, since this appears to be a competition of sick ideas, an even more
disgusting hack might be to write to the IDT from the NMI handler, and
install a NULL entry at #PF and rely on the double fault handler to detect
faults - double faults dont clobber the cr2 i think ...

( I think to protect the fragile and pure fabric of lkml against moral
  corruption, disgusting patches must remain unsent and disgusting ideas
  like this must absolutely stay unspoken. Hence i have removed lkml from
  the Cc:. [Oops i didnt ... too late, and this mail has already been
  sent! :-/ ])

> If a real NMI happens, then nmi_seqno will always be different,
> and we'll just retry the fault (the NMI handler would do something
> like
>
>     write_cr2(MAGIC_CR2);
>     atomic_inc(&nmi_seqno);
>
> to set it all up).
>
> Anyway, I do think that the _correct_ solution is to not do page
> faults from within NMI's, but the above is an outline of how we
> could _try_ to handle it if we really really wanted to. IOW, the
> fact that cr2 gets corrupted is not insurmountable, exactly
> because we _could_ always just retrigger the page fault, and thus
> "re-create' the corrupted %cr2 value.
>
> Hacky, hacky. And I'm not sure how happy CPU's even are to have
> %cr2 written to, so we could hit CPU issues.

If cr2 cannot be safely written to on a CPU, that could be worked around by
counting the number of NMIs via a percpu_add(this_nmi_count, 1) and
retrying faults if any NMI happened between the previous fault and this
fault. This has the disadvantage of potentially doubling the number of
pagefaults though. But it would certainly work as a tricky quirk to this
quirk which is added to a rather quirky code-path to begin with.

	Ingo
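
For illustration, the NMI-counting fallback described above could look
roughly like the following. This is only a user-space model of the control
flow for a single CPU, not kernel code: the plain counter stands in for the
percpu_add(this_nmi_count, 1) update, and the names fault_seen_count,
nmi_handler() and fault_should_retry() are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for per-CPU state; one CPU is modeled here. */
static unsigned long this_nmi_count;    /* bumped once per NMI */
static unsigned long fault_seen_count;  /* NMI count seen by the previous fault */

/* Model of the NMI side: in the kernel this would be a per-CPU add. */
static void nmi_handler(void)
{
	this_nmi_count++;
}

/*
 * Model of the check at page fault entry. Returns true if an NMI ran
 * since the previous fault, in which case cr2 may have been clobbered
 * and the fault is simply returned from so that it retriggers.
 */
static bool fault_should_retry(void)
{
	if (fault_seen_count != this_nmi_count) {
		fault_seen_count = this_nmi_count;
		return true;
	}
	return false;
}
```

The retried fault sees an unchanged count and is handled normally. Every
first fault after an NMI is retried whether or not cr2 was actually
corrupted, which is where the potential doubling of pagefaults comes from.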