On 11/4/24 09:22, Shah, Amit wrote:
>> I think you're wrong. We can't depend on ERAPS for this. Linux
>> doesn't flush the TLB on context switches when PCIDs are in play.
>> Thus, ERAPS won't flush the RSB and will leave bad state in there
>> and will leave the system vulnerable.
>>
>> Or what am I missing?
> I just received confirmation from our hardware engineers on this too:
>
> 1. the RSB is flushed when CR3 is updated
> 2. the RSB is flushed when INVPCID is issued (except type 0 - single
> address).
>
> I didn't mention 1. so far, which led to your question, right?

Not only did you not mention it, you said something _completely_
different. So, where is the documentation for this thing? I dug
through the 57230 .zip file and I see the CPUID bit:

	24 ERAPS. Read-only. Reset: 1. Indicates support for enhanced
	   return address predictor security.

but nothing telling us how it works.

> Does this now cover all the cases?

Nope, it's worse than I thought. Look at:

> SYM_FUNC_START(__switch_to_asm)
...
> 	FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW

which does the RSB fill at the same time it switches RSP.

So we feel the need to flush the RSB on *ALL* task switches. That
includes switches between threads in a process *AND* switches over to
kernel threads from user ones.

So, I'll flip this back around. Today, X86_FEATURE_RSB_CTXSW zaps the
RSB whenever RSP is updated to a new task stack. Please convince me
that ERAPS provides superior coverage or is unnecessary in all the
possible combinations switching between:

	different thread, same mm
	user=>kernel, same mm
	kernel=>user, same mm
	different mm (we already covered this)

Because several of those switches can happen without a CR3 write or
INVPCID.
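
The coverage question can be tabulated with a small toy model. This is only a sketch, not kernel code and not taken from any AMD documentation. The names `Switch`, `SWITCHES`, `flushed_by_rsb_ctxsw`, `flushed_by_eraps` and `coverage_gaps` are hypothetical, and the `writes_cr3` / `issues_invpcid` values are assumed worst cases for each kind of switch, not measurements. The only rules encoded are the two stated in this thread: X86_FEATURE_RSB_CTXSW fills the RSB on every task switch, and ERAPS is said to flush on a CR3 update or on an INVPCID other than type 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    """One kind of task switch (hypothetical model, not a kernel structure)."""
    name: str
    writes_cr3: bool      # ASSUMPTION: worst case for this kind of switch
    issues_invpcid: bool  # ASSUMPTION: True only for a non-type-0 INVPCID


# The four combinations listed in the mail. The boolean values are
# assumptions chosen to illustrate the concern, not verified behavior.
SWITCHES = [
    Switch("different thread, same mm", writes_cr3=False, issues_invpcid=False),
    Switch("user=>kernel, same mm", writes_cr3=False, issues_invpcid=False),
    Switch("kernel=>user, same mm", writes_cr3=False, issues_invpcid=False),
    Switch("different mm", writes_cr3=True, issues_invpcid=False),
]


def flushed_by_rsb_ctxsw(sw: Switch) -> bool:
    # Per the mail: the RSB fill happens whenever RSP moves to a new task
    # stack, which is every task switch.
    return True


def flushed_by_eraps(sw: Switch) -> bool:
    # Per the vendor statement quoted in the mail: flush on a CR3 update or
    # on INVPCID (except type 0).
    return sw.writes_cr3 or sw.issues_invpcid


def coverage_gaps(switches):
    """Names of switches covered by RSB_CTXSW today but not by ERAPS alone."""
    return [
        sw.name
        for sw in switches
        if flushed_by_rsb_ctxsw(sw) and not flushed_by_eraps(sw)
    ]
```

Under these assumptions, `coverage_gaps(SWITCHES)` returns the three same-mm cases, which are exactly the ones that need an answer before the software RSB fill could be dropped.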