* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> Is there a testcase for the SkyLake 16-deep-call-stack problem that I could run?
> Is there a description of the exact speculative execution vulnerability that has
> to be addressed to begin with?

Ok, so for now I'm assuming that this is the 16 entries return-stack-buffer
underflow condition where SkyLake falls back to the branch predictor (while other
CPUs wrap the buffer).

> If this approach is workable I'd much prefer it to any MSR writes in the syscall
> entry path not just because it's fast enough in practice to not be turned off by
> everyone, but also because everyone would agree that per function call overhead
> needs to go away on new CPUs. Both deployment and backporting is also _much_ more
> flexible, simpler, faster and more complete than microcode/firmware or compiler
> based solutions.
>
> Assuming the vulnerability can be addressed via this route that is, which is a big
> assumption!

So I talked this over with PeterZ, and I think it's all doable:

 - the CALL __fentry__ callbacks maintain the depth tracking (on the kernel stack,
   fast to access), and issue an "RSB-stuffing sequence" when depth reaches 16
   entries.

 - the "RSB-stuffing sequence" is a return trampoline that pushes a CALL on the
   stack which is executed on the RET.

 - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. (The
   tracking could probably be made IRQ and maybe even NMI safe, but the worst-case
   nesting scenarios make my head ache.)

I.e. IBRS can be mostly replaced with a kernel based solution that is better than
IBRS and which does not negatively impact any other non-SkyLake CPUs or general
code quality.

I.e. a full upstream Spectre solution.

Thanks,

	Ingo
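
To make the three points above a bit more concrete, here is a rough user-space model of the bookkeeping only. It is a sketch under assumptions, not kernel code: the names rsb_tracker, rsb_stuff(), rsb_track_call() and rsb_before_iret() are made up for illustration, the stuffing sequence is merely counted instead of being an assembly return trampoline, and the model only counts function entries (it does not model returns or nesting of asynchronous contexts).

```c
#include <assert.h>

#define RSB_ENTRIES 16	/* return-stack-buffer depth mentioned in the mail */

/*
 * Hypothetical tracking state. In the proposal this would live on the
 * kernel stack; here it is an ordinary struct.
 */
struct rsb_tracker {
	unsigned int depth;	/* function entries seen since the last stuffing */
	unsigned int stuffings;	/* how often the stuffing sequence ran (illustration only) */
};

/*
 * Stand-in for the "RSB-stuffing sequence". The real thing would be a
 * return trampoline in assembly; this model only records that it ran
 * and restarts the depth count.
 */
static void rsb_stuff(struct rsb_tracker *t)
{
	t->stuffings++;
	t->depth = 0;
}

/*
 * Roughly what a CALL __fentry__ callback might do on function entry:
 * bump the depth and stuff once it reaches RSB_ENTRIES.
 */
static void rsb_track_call(struct rsb_tracker *t)
{
	if (++t->depth >= RSB_ENTRIES)
		rsb_stuff(t);

	/* Invariant of this model: the tracked depth never stays at 16. */
	assert(t->depth < RSB_ENTRIES);
}

/*
 * Asynchronous contexts (IRQs, NMIs, etc.) do not rely on the tracking:
 * they stuff unconditionally before returning via IRET.
 */
static void rsb_before_iret(struct rsb_tracker *t)
{
	rsb_stuff(t);
}
```

In this model fifteen consecutive rsb_track_call() invocations leave the depth at 15 with no stuffing, the sixteenth triggers one stuffing and resets the depth to 0, and rsb_before_iret() stuffs regardless of the current depth. Stuffing unconditionally on the asynchronous paths is what sidesteps the nesting question: the tracking itself never has to be IRQ or NMI safe.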