On 11/5/24 02:39, Shah, Amit wrote:
> On Mon, 2024-11-04 at 09:45 -0800, Dave Hansen wrote:
>
> I'm expecting the APM update come out soon, but I have put together
>
> https://amitshah.net/2024/11/eraps-reduces-software-tax-for-hardware-bugs/
>
> based on information I have.  I think it's mostly consistent with what
> I've said so far - with the exception of the mov-CR3 flush only
> confirmed yesterday.

That's better.  But your original cover letter did say:

	Feature documented in AMD PPR 57238.

which is technically true because the _bit_ is defined.  But it's far,
far from being sufficiently documented for Linux to actually use it.

Could we please be more careful about these in the future?

>> So, I'll flip this back around.  Today, X86_FEATURE_RSB_CTXSW zaps
>> the RSB whenever RSP is updated to a new task stack.  Please convince
>> me that ERAPS provides superior coverage or is unnecessary in all the
>> possible combinations switching between:
>>
>> 	different thread, same mm
>
> This case is the same userspace process with valid addresses in the RSB
> for that process.  An invalid speculation isn't security sensitive,
> just a misprediction that won't be retired.  So we are good here.

Does that match what the __switch_to_asm comment says, though?

>	/*
>	 * When switching from a shallower to a deeper call stack
>	 * the RSB may either underflow or use entries populated
>	 * with userspace addresses. On CPUs where those concerns
>	 * exist, overwrite the RSB with entries which capture
>	 * speculative execution to prevent attack.
>	 */

It is also talking just about call depth, not about same-address-space
RSB entries being harmless.  That's because this is also trying to avoid
having the kernel consume any user-placed RSB entries, regardless of
whether they're from the same mm or not.

>> 	user=>kernel, same mm
>> 	kernel=>user, same mm
>
> user-kernel is protected with SMEP.  Also, we don't call
> FILL_RETURN_BUFFER for these switches?

Amit, I'm beginning to fear that you haven't gone and looked at the
relevant code here.  Please go look at SYM_FUNC_START(__switch_to_asm)
in arch/x86/entry/entry_64.S.  I believe this code is called for all
task switches, including switching from a user task to a kernel task.  I
also believe that FILL_RETURN_BUFFER is used unconditionally for every
__switch_to_asm call (when X86_FEATURE_RSB_CTXSW is on, of course).

Could we please start over on this patch?

Let's get the ERAPS+TLB-flush nonsense out of the kernel and get the
commit message right.

Then let's go from there.
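
To make the point about FILL_RETURN_BUFFER concrete, here is a toy C model of the claim above.  It is only a sketch, not kernel code: every name in it (task_kind, toy_switch_to, feature_rsb_ctxsw, count_fills_all_combinations) is made up for illustration.  It assumes, as stated above, that the fill is gated only on the CPU feature bit and not on which kinds of tasks are involved or whether they share an mm.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical task kinds, for illustration only. */
enum task_kind { USER_TASK, KERNEL_THREAD };

/* Toy stand-in for a CPU with X86_FEATURE_RSB_CTXSW set or clear. */
static bool feature_rsb_ctxsw = true;

/* Counts how many times the toy "RSB fill" ran. */
static int rsb_fills;

/*
 * Toy model of the switch path: the fill depends only on the feature
 * flag.  prev, next and same_mm are deliberately ignored, which is the
 * whole point of the model.
 */
static void toy_switch_to(enum task_kind prev, enum task_kind next,
			  bool same_mm)
{
	assert(prev == USER_TASK || prev == KERNEL_THREAD);
	assert(next == USER_TASK || next == KERNEL_THREAD);
	(void)same_mm;

	if (feature_rsb_ctxsw)
		rsb_fills++;
}

/* Walk all 2 x 2 x 2 combinations and return the number of fills. */
static int count_fills_all_combinations(void)
{
	rsb_fills = 0;
	for (int prev = USER_TASK; prev <= KERNEL_THREAD; prev++)
		for (int next = USER_TASK; next <= KERNEL_THREAD; next++)
			for (int same_mm = 0; same_mm <= 1; same_mm++)
				toy_switch_to(prev, next, same_mm);
	return rsb_fills;
}
```

With the flag set, the model fills on all eight combinations, including user=>kernel and kernel=>user with the same mm.  With the flag clear, it never fills.  Any argument that ERAPS makes the fill unnecessary has to cover each of those cases.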