Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

* David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:

> But wait, why did I say "mostly"? Well, not everyone has a retpoline
> compiler yet... but OK, screw them; they need to update.
> 
> Then there's Skylake, and that generation of CPU cores. For complicated
> reasons they actually end up being vulnerable not just on indirect
> branches, but also on a 'ret' in some circumstances (such as 16+ CALLs
> in a deep chain).
> 
> The IBRS solution, ugly though it is, did address that. Retpoline
> doesn't. There are patches being floated to detect and prevent deep
> stacks, and deal with some of the other special cases that bite on SKL,
> but those are icky too. And in fact IBRS performance isn't anywhere
> near as bad on this generation of CPUs as it is on earlier CPUs
> *anyway*, which makes it not quite so insane to *contemplate* using it
> as Intel proposed.

There's another possible method to avoid deep stacks on Skylake, without compiler 
support:

  - Use the existing mcount based function tracing live patching machinery
    (CONFIG_FUNCTION_TRACER=y) to install a _very_ fast and simple stack depth 
    tracking tracer which would issue a retpoline when stack depth crosses 
    boundaries of ~16 entries.

The overhead of that would _still_ very likely be much lower than that of an 
expensive MSR write, costing hundreds (or thousands) of cycles, at every kernel 
entry (syscall entry, IRQ entry, etc.).
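
To make the stack depth tracking idea above concrete, here is a rough user-space 
sketch in C. It is only an illustration under stated assumptions, not kernel code: 
the names depth_tracer_enter(), depth_tracer_exit(), rsb_refill_stub() and 
simulate_chain() are hypothetical, the boundary of 16 is taken from the "~16 
entries" figure above, and the counter would presumably be per-CPU in a real 
implementation. Note also that mcount hooks fire at function entry only, so how 
returns would be tracked is an open question here; the exit hook is an assumption.

```c
#include <assert.h>

/* Assumed threshold, from the "~16 entries" figure in the mail. */
#define DEPTH_BOUNDARY 16

/* Would be per-CPU state in a kernel; plain globals for illustration. */
static unsigned int call_depth;
static unsigned int refill_count;

/* Hypothetical stand-in for the retpoline sequence; it only counts calls. */
static void rsb_refill_stub(void)
{
	refill_count++;
}

/* Assumed to run from the mcount hook at function entry. */
static void depth_tracer_enter(void)
{
	call_depth++;
	/* Fire each time the depth crosses a multiple of the boundary. */
	if (call_depth % DEPTH_BOUNDARY == 0)
		rsb_refill_stub();
}

/* Assumed to run on function return (mechanism not specified above). */
static void depth_tracer_exit(void)
{
	assert(call_depth > 0);
	call_depth--;
}

/* Simulate a call chain n deep, unwind it, and report the stub count. */
unsigned int simulate_chain(unsigned int n)
{
	unsigned int i;

	call_depth = 0;
	refill_count = 0;
	for (i = 0; i < n; i++)
		depth_tracer_enter();
	for (i = 0; i < n; i++)
		depth_tracer_exit();
	return refill_count;
}
```

In this sketch a chain shallower than 16 calls never reaches the stub, so the 
common case costs one increment and one compare per function entry.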

Note the huge number of advantages:

 - All major distro kernels already enable the mcount based patching options, so
   there's literally zero added overhead on anything except Skylake.

 - It is fully kernel patching based and can be activated on Skylake only

 - It doesn't require any microcode updates, so it will work on all existing CPUs
   with no firmware or microcode modifications

 - It doesn't require any compiler updates

 - Skylake performance is very likely to be much less fragile than it would be 
   with a hastily deployed microcode hack

 - The "Skylake stack depth tracer" can be tested on other CPUs as well in debug 
   builds, broadening the testing base

 - The tracer is very obviously simple and reviewable, and we can forget about it
   in the far future.

 - It's much more backportable to older kernels: should there be a new class of
   exploits then this machinery could be updated to cover that too - while 
   upgrades to newer kernels would give the higher-performance solution.

Yes, there are some practical complications, like having to always enable 
CONFIG_FUNCTION_TRACER=y on x86 and having to sort out the interaction with 
ftrace, but in practice the option is enabled on all major distros anyway, due to 
ftrace.

Is there any reason why this wouldn't work?

Thanks,

	Ingo


