Re: [PATCH 05/39] x86/entry/32: Unshare NMI return path

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Fri, 13 Jul 2018 10:26:30 -0700

On Fri, Jul 13, 2018 at 3:05 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
> On Thu, Jul 12, 2018 at 01:53:19PM -0700, Andy Lutomirski wrote:
>> > On Jul 11, 2018, at 4:29 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
>> > NMI will no longer use most of the shared return path,
>> > because NMI needs special handling when the CR3 switches for
>> > PTI are added.
>>
>> Why?  What would go wrong?
>>
>> How many return-to-usermode paths will we have?  64-bit has only one.
>
> In the non-NMI return path we make a decission on whether we return to
> user-space or kernel-space and do different things based on that. For
> example, when returning to user-space we call
> prepare_exit_to_usermode(). With the CR3 switches added later we also
> unconditionally switch to user-cr3 when we are in the return-to-user
> path.
>
> The NMI return path does not need any of that, as it doesn't call
> prepare_exit_to_usermode() even when it returns to user-space. It
> doesn't even care where it returns to. It just remembers stack and cr3
> on entry in callee-safed registers and restores that on exit. This works
> in the NMI path because it is pretty simple and doesn't do any fancy
> work on exit.
>
> While working on a previous version I also tried to store stack and cr3
> in a callee-safed register and restore that on exit again, but it didn't
> work, most likley because something in-between overwrote one of the
> registers. I also found it a bit fragile to make make two registers
> untouchable in the whole entry-code. It doesn't make future changes
> simpler or more robust.
>
> So long story short, the NMI path can be simpler wrt. stack and cr3
> handling as the other entry/exit points, and therefore it is handled
> differently.
>
>

We used to do it this way on 64-bit, but I had to change it because of
a nasty case where we *fail* the return to user mode when we're
returning from an NMI.  In theory this can't happen any more due to a
bunch of tightening up of the way we handle segmentation, but it's
still quite nasty.  The whole situation on 32-bit isn't quite as
fragile because espfix32 is much more robust than espfix64.

So I suppose this is okay, but I wouldn't be totally shocked if we
need to redo it down the road.