On Wed, 20 Sep 2023 17:47:35 +0100,
Rob Herring <robh@xxxxxxxxxx> wrote:
>
> On Tue, Sep 19, 2023 at 7:50 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > On Tue, 19 Sep 2023 13:29:07 +0100,
> > Rob Herring <robh@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Sep 18, 2023 at 5:18 AM Marc Zyngier <maz@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On 2023-09-18 11:01, Will Deacon wrote:
> > > > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote:
> > > > >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an
> > > > >> affected Cortex-A520 core, a speculatively executed unprivileged load
> > > > >> might leak data from a privileged level via a cache side channel.
> > > > >>
> > > > >> The workaround is to execute a TLBI before returning to EL0. A
> > > > >> non-shareable TLBI to any address is sufficient.
> > > > >
> > > > > Can you elaborate at all on how this works, please? A TLBI addressing a
> > > > > cache side channel feels weird (or is "cache" referring to some TLB
> > > > > structures rather than e.g. the data cache here?).
> > > > >
> > > > > Assuming there's some vulnerable window between the speculative
> > > > > unprivileged load and the completion of the TLBI, what prevents another
> > > > > CPU from observing the side-channel during that time? Also, does the
> > > > > TLBI need to be using the same ASID as the unprivileged load? If so, then
> > > > > a context-switch could widen the vulnerable window quite significantly.
> > > >
> > > > Another 'interesting' case is the KVM world switch. If EL0 is
> > > > affected, what about EL1? Can such a data leak exist cross-EL1,
> > > > or from EL2 to El1? Asking for a friend...
> > >
> > > I'm checking for a definitive answer, but page table isolation also
> > > avoids the issue. Wouldn't these scenarios all be similar to page
> > > table isolation in that the EL2 or prior EL1 context is unmapped?
> >
> > No, EL2 is always mapped, and we don't have anything like KPTI there.
> >
> > Maybe the saving grace is that EL2 and EL2&0 are different translation
> > regimes from EL1&0, but there's nothing in the commit message that
> > indicates it. As for EL1-to-EL1 leaks, it again completely depends on
> > how the TLBs are tagged.
>
> Different translation regimes are not affected. It must be the same
> regime and same translation.

It would be good to capture this, then.

>
> > You'd hope that having different VMIDs would save the bacon, but if
> > you can leak EL1 translations into EL0, it means that the associated
> > permission and/or tags do not contain all the required information...
>
> The VMID is part of the equation. See here[1].

I have a pretty good idea of how TLB are *supposed* to behave. The
fact that you need some sort of invalidation on ERET to EL0 is the
proof that this CPU doesn't follow these rules to the letter...

	M.

-- 
Without deviation from the norm, progress is not possible.