Re: [PATCH v10 01/26] Documentation/x86: Add CET description

"H.J. Lu" <hjl.tools@xxxxxxxxx> · Fri, 15 May 2020 19:37:00 -0700

On Fri, May 15, 2020 at 5:13 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 15/05/2020 23:43, Dave Hansen wrote:
> > On 5/15/20 2:33 PM, Yu-cheng Yu wrote:
> >> On Fri, 2020-05-15 at 11:39 -0700, Dave Hansen wrote:
> >>> On 5/12/20 4:20 PM, Yu-cheng Yu wrote:
> >>> Can a binary compiled with CET run without CET?
> >> Yes, but a few details:
> >>
> >> - The shadow stack is transparent to the application.  A CET application does
> >> not have anything different from a non-CET application.  However, if a CET
> >> application uses any CET instructions (e.g. INCSSP), it must first check if CET
> >> is turned on.
> >> - If an application is compiled for IBT, the compiler inserts ENDBRs at branch
> >> targets.  These are nops if IBT is not on.
> > I appreciate the detailed response, but it wasn't quite what I was
> > asking.  Let's ignore IBT for now and just talk about shadow stacks.
> >
> > An app compiled with the new ELF flags and running on a CET-enabled
> > kernel and CPU will start off with shadow stacks allocated and enabled,
> > right?  It can turn its shadow stack off per-thread with the new prctl.
> >  But, otherwise, it's stuck, the only way to turn shadow stacks off at
> > startup would be editing the binary.
> >
> > Basically, if there ends up being a bug in an app that violates the
> > shadow stack rules, the app is broken, period.  The only recourse is to
> > have the kernel disable CET and reboot.
> >
> > Is that right?
>
> If I may interject with the experience of having got supervisor shadow
> stacks working for Xen.
>
> Turning shadow stacks off is quite easy - clear MSR_U_CET.SHSTK_EN and
> the shadow stack will stay in whatever state it was in, and you can
> largely forget about it.  (Of course, in a sandbox scenario, it would be
> prudent to prevent the ability to disable shadow stacks.)
>
> Turning shadow stacks on is much more tricky.  You cannot enable it in
> any function you intend to return from, as the divergence between the
> stack and shadow stack will constitute a control flow violation.
>
>
> When it comes to binaries,  you can reasonably arrange for clone() to
> start a thread on a new stack/shstk, as you can prepare both stacks
> suitably before execution starts.
>
> You cannot reasonably implement a system call for "turn shadow stacks on
> for me", because you'll crash on the ret out of the VDSO from the system
> call.  It would be possible to conceive of an exec()-like system call
> which is "discard my current stack, turn on shstk, and start me on this
> new stack/shstk".
>
> In principle, with a pair of system calls to atomically manage the ststk
> settings and stack switching, it might possible to construct a
> `run_with_shstk_enabled(func, stack, shstk)` API which executes in the
> current threads context and doesn't explode.
>
> Fork() is a problem when shadow stacks are disabled in the parent.  The
> moment shadow stacks are disabled, the regular stack diverges from the
> shadow stack.  A CET-enabled app which turns off shstk and then fork()'s
> must have the child inherit the shstk-off property.  If the child were
> to start with shstk enabled, it would explode almost immediately due to
> the parent's stack divergence which it inherited.
>
>
> Finally seeing as the question was asked but not answered, it is
> actually quite easy to figure out whether shadow stacks are enabled in
> the current thread.
>
>     mov     $1, %eax
>     rdsspd  %eax

This is for 32-bit mode.  I use

        /* Check if shadow stack is in use.  */
        xorl    %esi, %esi
        rdsspq  %rsi
        testq   %rsi, %rsi
        /* Normal return if shadow stack isn't in use.  */
        je      L(no_shstk)

>     cmp     $1, %eax
>     je      no_shstk
>             ...
> no_shsk:
>
> rdssp is allocated from the hint nop encoding space, and the minimum
> alignment of the shadow stack pointer is 4.  On older parts, or with
> shstk disabled (either at the system level, or for the thread), the $1
> will be preserved in %eax, while if CET is active, it will be clobbered
> with something that has the bottom two bits clear.
>
> It turns out this is a lifesaver for codepaths (e.g. the NMI handler)
> which need to use other CET instructions which aren't from the hint nop
> space, and run before the BSP can set everything up.
>
> ~Andrew

-- 
H.J.