On 10/02/25 23:08, Jann Horn wrote:
> On Mon, Feb 10, 2025 at 7:36 PM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
>> What if isolated CPUs unconditionally did a TLBi as late as possible in
>> the stack right before returning to userspace? This would mean that upon
>> re-entering the kernel, an isolated CPU's TLB wouldn't contain any kernel
>> range translation - with the exception of whatever lies between the
>> last-minute flush and the actual userspace entry, which should be feasible
>> to vet? Then AFAICT there wouldn't be any work/flush to defer, the IPI
>> could be entirely silenced if it targets an isolated CPU.
>
> Two issues with that:
>

Firstly, thank you for entertaining the idea :-)

> 1. I think the "Common not Private" feature Will Deacon referred to is
> incompatible with this idea:
> <https://developer.arm.com/documentation/101811/0104/Address-spaces/Common-not-Private>
> says "When the CnP bit is set, the software promises to use the ASIDs
> and VMIDs in the same way on all processors, which allows the TLB
> entries that are created by one processor to be used by another"
>

Sorry for being obtuse - I can understand inconsistent TLB states (old vs
new translations being present in separate TLBs) due to not sending the
flush IPI causing an issue with that, but not "flushing early". Even if TLB
entries can be shared/accessed between CPUs, a CPU should be allowed not to
have a shared entry in its TLB - what am I missing?

> 2. It's wrong to assume that TLB entries are only populated for
> addresses you access - thanks to speculative execution, you have to
> assume that the CPU might be populating random TLB entries all over
> the place.

Gotta love speculation. Now it is supposed to be limited to genuinely
accessible data & code, right? Say theoretically we have a full TLBi as
literally the last thing before doing the return-to-userspace, speculation
should be limited to executing maybe bits of the return-from-userspace
code?

Furthermore, I would hope that once a CPU is executing in userspace, it's not going to populate the TLB with kernel address translations - AIUI the whole vulnerability mitigation debacle was about preventing this sort of thing.
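
To make the proposal and objection 2 a bit more concrete, here is a toy
userspace model in C. It is only a sketch under loud assumptions: every name
in it (toy_cpu, toy_exit_to_user(), toy_change_kernel_mapping(), ...) is made
up for illustration, none of it is kernel code, a single "generation" counter
stands in for the kernel page tables, and it models neither CnP nor what the
hardware really does speculatively.

```c
#include <stdbool.h>

#define TOY_NR_CPUS   2
#define TOY_TLB_EMPTY (-1)

/* Hypothetical per-CPU state; not a kernel structure. */
struct toy_cpu {
    bool isolated;    /* CPU belongs to the isolated set */
    bool in_user;     /* CPU is currently running userspace */
    int  cached_gen;  /* generation of the cached kernel translation, or TOY_TLB_EMPTY */
    int  ipis;        /* number of flush IPIs this CPU received */
};

/* CPU 0 is a housekeeping CPU, CPU 1 is isolated. */
struct toy_cpu toy_cpus[TOY_NR_CPUS] = {
    { .isolated = false, .in_user = false, .cached_gen = TOY_TLB_EMPTY, .ipis = 0 },
    { .isolated = true,  .in_user = false, .cached_gen = TOY_TLB_EMPTY, .ipis = 0 },
};

/* Stand-in for the current state of the kernel page tables. */
int toy_kernel_gen;

/* Model a TLB fill (demand or speculative): cache the current translation if empty. */
void toy_tlb_fill(int cpu)
{
    if (toy_cpus[cpu].cached_gen == TOY_TLB_EMPTY)
        toy_cpus[cpu].cached_gen = toy_kernel_gen;
}

/* The proposed late flush: an isolated CPU empties its TLB right before userspace. */
void toy_exit_to_user(int cpu)
{
    if (toy_cpus[cpu].isolated)
        toy_cpus[cpu].cached_gen = TOY_TLB_EMPTY;
    toy_cpus[cpu].in_user = true;
}

void toy_enter_kernel(int cpu)
{
    toy_cpus[cpu].in_user = false;
}

/* Change a kernel mapping and flush remote TLBs, skipping isolated CPUs in userspace. */
void toy_change_kernel_mapping(void)
{
    toy_kernel_gen++;
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        struct toy_cpu *c = &toy_cpus[cpu];

        if (c->isolated && c->in_user)
            continue;               /* IPI silenced for this CPU */
        c->ipis++;
        c->cached_gen = TOY_TLB_EMPTY;
    }
}

/* True if the CPU still caches a translation older than the current mapping. */
bool toy_tlb_is_stale(int cpu)
{
    return toy_cpus[cpu].cached_gen != TOY_TLB_EMPTY &&
           toy_cpus[cpu].cached_gen != toy_kernel_gen;
}
```

In this model, a CPU that flushes in toy_exit_to_user() and fills nothing
until toy_enter_kernel() never observes a stale entry even though its IPI was
skipped. Calling toy_tlb_fill() while in_user is set (standing in for a
speculative fill of a kernel translation, if that can happen at all) leaves
toy_tlb_is_stale() returning true after the next mapping change, which is the
case the scheme would have to rule out.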