On Mon, 2024-11-04 at 09:45 -0800, Dave Hansen wrote:
> On 11/4/24 09:22, Shah, Amit wrote:
> > > I think you're wrong. We can't depend on ERAPS for this. Linux
> > > doesn't flush the TLB on context switches when PCIDs are in play.
> > > Thus, ERAPS won't flush the RSB and will leave bad state in there
> > > and will leave the system vulnerable.
> > >
> > > Or what am I missing?
> > I just received confirmation from our hardware engineers on this
> > too:
> >
> > 1. the RSB is flushed when CR3 is updated
> > 2. the RSB is flushed when INVPCID is issued (except type 0 -
> >    single address).
> >
> > I didn't mention 1. so far, which led to your question, right?
>
> Not only did you not mention it, you said something _completely_
> different. So, where the documentation for this thing? I dug through
> the 57230 .zip file and I see the CPUID bit:
>
>     24 ERAPS. Read-only. Reset: 1. Indicates support for enhanced
>               return address predictor security.
>
> but nothing telling us how it works.

I'm expecting the APM update to come out soon, but I have put together

https://amitshah.net/2024/11/eraps-reduces-software-tax-for-hardware-bugs/

based on the information I have. I think it's mostly consistent with
what I've said so far, with the exception of the mov-CR3 flush, which
was only confirmed yesterday.

> > Does this now cover all the cases?
>
> Nope, it's worse than I thought. Look at:
>
> > SYM_FUNC_START(__switch_to_asm)
> ...
> >     FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
>
> which does the RSB fill at the same time it switches RSP.
>
> So we feel the need to flush the RSB on *ALL* task switches. That
> includes switches between threads in a process *AND* switches over to
> kernel threads from user ones.

(since these cases are the same as those listed below, I'll only reply
in one place)

> So, I'll flip this back around. Today, X86_FEATURE_RSB_CTXSW zaps the
> RSB whenever RSP is updated to a new task stack. Please convince me
> that ERAPS provides superior coverage or is unnecessary in all the
> possible combinations switching between:
>
>     different thread, same mm

This case is the same userspace process, with valid addresses in the
RSB for that process. An invalid speculation isn't security sensitive,
just a misprediction that won't be retired. So we are good here.

>     user=>kernel, same mm
>     kernel=>user, same mm

user-kernel is protected with SMEP. Also, we don't call
FILL_RETURN_BUFFER for these switches?

>     different mm

(we already covered this)

>
> Because several of those switches can happen without a CR3 write or
> INVPCID.

(that covers all of them IIRC)

		Amit