Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

Ingo Molnar <mingo@xxxxxxxxxx> · Tue, 23 Jan 2018 10:27:56 +0100

* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> Is there a testcase for the SkyLake 16-deep-call-stack problem that I could run? 
> Is there a description of the exact speculative execution vulnerability that has 
> to be addressed to begin with?

Ok, so for now I'm assuming that this is the 16 entries return-stack-buffer 
underflow condition where SkyLake falls back to the branch predictor (while other 
CPUs wrap the buffer).

> If this approach is workable I'd much prefer it to any MSR writes in the syscall 
> entry path not just because it's fast enough in practice to not be turned off by 
> everyone, but also because everyone would agree that per function call overhead 
> needs to go away on new CPUs. Both deployment and backporting is also _much_ more 
> flexible, simpler, faster and more complete than microcode/firmware or compiler 
> based solutions.
> 
> Assuming the vulnerability can be addressed via this route that is, which is a big 
> assumption!

So I talked this over with PeterZ, and I think it's all doable:

 - the CALL __fentry__ callbacks maintain the depth tracking (on the kernel 
   stack, fast to access), and issue an "RSB-stuffing sequence" when depth reaches
   16 entries.

 - "the RSB-stuffing sequence" is a return trampoline that pushes a CALL on the 
   stack which is executed on the RET.

 - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. (The 
   tracking could probably made IRQ and maybe even NMI safe, but the worst-case 
   nesting scenarios make my head ache.)

I.e. IBRS can be mostly replaced with a kernel based solution that is better than 
IBRS and which does not negatively impact any other non-SkyLake CPUs or general 
code quality.

I.e. a full upstream Spectre solution.

Thanks,

	Ingo