Re: Patch "x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack" has been added to the 3.15-stable tree

Andrew Lutomirski <amluto@xxxxxxxxx> · Mon, 14 Jul 2014 19:56:44 -0700

On Mon, Jul 14, 2014 at 7:40 PM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:
> On 07/14/2014 10:30 PM, Andrew Lutomirski wrote:
>>
>> On Mon, Jul 14, 2014 at 7:21 PM, Boris Ostrovsky
>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>
>>> On 07/14/2014 06:47 PM, H. Peter Anvin wrote:
>>>>
>>>> On 07/14/2014 02:43 PM, Andrew Lutomirski wrote:
>>>>>>
>>>>>> That patch is running through my build tests as we speak.  I expect to
>>>>>> push it to tip:x86/urgent in about an hour.
>>>>>
>>>>> I think that espfix64 is completely broken on Xen, regardless of the
>>>>> pud vs pmd issue :(  See:
>>>>>
>>>>> http://lkml.kernel.org/g/CALCETrWG-dQL8ipJ8cO3wfbYKA=mAv3CS4-1JFwmBXF3pUbAwg@xxxxxxxxxxxxxx
>>>>>
>>> This is exactly the problem that the patch is fixing:
>>>      paravirt_alloc_pte(&init_mm, __pa(stack_page) >> PAGE_SHIFT)
>>> that we are removing is trying to claim that stack_page is going to be the
>>> PTE
>>> page (i.e. the last level). So the hypervisor will mark it read-only.
>>
>> I don't follow.  The espfix64 stack is read-only by design.
>
>
> Isn't 'movq %rax,(0*8)(%rdi)' from
> http://article.gmane.org/gmane.linux.kernel/1746680 trying to write
> stack_page?

Right, sorry.  There are two aliases of the stack.  The one in
entry_64.S is writing to the writable alias (which is presumably
screwed up without this patch).  The one in xen-asm_64.S is writing to
the stack pointed to by rsp, which is read-only by then.

(Disclaimer: I seem to have the flu.  Any and all technical statements
I make may be arbitrarily incorrect.)

>>>> And it is also completely unclear to me if it is actually necessary.
>>>> Again, I would like to know how the Xen IRET pvop actually handles a
>>>> 16-bit stack segment.  If it restores all of RSP then espfix isn't
>>>> necessary.
>>>
>>>
>>> Last time I looked at this I thought it was not necessary since the iret
>>> handler in the hypervisor copies saved RSP. I may be wrong though.
>>
>> How?  This is a CPU misdesign we're talking about.
>
>
> I was referring to the IRET hypercall:
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/x86_64/traps.c;h=650c33d3da7eb8a76d26a1e4a934b32e67587a40;hb=HEAD
>
> There may be other paths, but that's the one that I had in mind.

The issue here is that, when the target SS is 16-bit, IRET may not
restore all the bits of RSP.  espfix64 is a moderately complicated
hack that restores bits 31:16 of RSP before issuing iret to userspace
(if SS is an LDT descriptor) so that userspace sees the correct values
regardless of whether IRET restores those bits.

Since IRET is actually a hypercall on Xen guests, this hack may be
entirely ineffective, even if it didn't crash: the Xen hypervisor will
just load some other value into bits 31:16 before doing IRET for real.

--Andy