On Fri, 10 Nov 2017, Dave Hansen wrote: > From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> > > This is largely code from Andy Lutomirski. I fixed a few bugs > in it, and added a few SWITCH_TO_* spots. > > KAISER needs to switch to a different CR3 value when it enters > the kernel and switch back when it exits. This essentially > needs to be done before leaving assembly code. > > This is extra challenging because the switching context is > tricky: the registers that can be clobbered can vary. It is also > hard to store things on the stack because there is an established > ABI (ptregs) or the stack is entirely unsafe to use. Changelog nitpicking starts here > This patch establishes a set of macros that allow changing to s/This patch establishes/Establish/ > the user and kernel CR3 values. > > Interactions with SWAPGS: previous versions of the KAISER code > relied on having per-cpu scratch space to save/restore a register > that can be used for the CR3 MOV. The %GS register is used to > index into our per-cpu space, so SWAPGS *had* to be done before s/our/the/ > the CR3 switch. That scratch space is gone now, but the semantic > that SWAPGS must be done before the CR3 MOV is retained. This is > good to keep because it is not that hard to do and it allows us s/us// > to do things like add per-cpu debugging information to help us > figure out what goes wrong sometimes. the part after 'information' is fairy tale mode and redundant. Debugging information says it all, right? > What this does in the NMI code is worth pointing out. NMIs > can interrupt *any* context and they can also be nested with > NMIs interrupting other NMIs. The comments below > ".Lnmi_from_kernel" explain the format of the stack during this > situation. Changing the format of this stack is not a fun > exercise: I tried. Instead of storing the old CR3 value on the > stack, this patch depend on the *regular* register save/restore > mechanism and then uses %r14 to keep CR3 during the NMI. It is > callee-saved and will not be clobbered by the C NMI handlers that > get called. The comments below ".Lnmi_from_kernel" explain the format of the stack during this situation. Changing this stack format is too complex and risky, so the following solution has been used: Instead of storing the old CR3 value on the stack, depend on the regular register save/restore mechanism and use %r14 to hold CR3 during the NMI. r14 is callee-saved and will not be clobbered by the C NMI handlers that get called. End of nitpicking > +.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req > + movq %cr3, %r\scratch_reg > + movq %r\scratch_reg, \save_reg > + /* > + * Is the switch bit zero? This means the address is > + * up in real KAISER patches in a moment. * If the switch bit is zero, CR3 points at the kernel page tables * already. Hmm? > /* > @@ -1189,6 +1201,7 @@ ENTRY(paranoid_exit) > testl %ebx, %ebx /* swapgs needed? */ > jnz .Lparanoid_exit_no_swapgs > TRACE_IRQS_IRETQ > + RESTORE_CR3 %r14 You have the named macro arguments everywhere, just not here. Other than that. Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>