On Fri, Jan 31, 2025 at 10:56:56AM +0000, Mark Rutland wrote: > On Thu, Jan 30, 2025 at 12:46:15PM -0800, Oliver Upton wrote: > > There are a few places where the idreg overrides are read w/ the MMU > > off, for example the VHE and hVHE checks in __finalise_el2. And while > > the infrastructure gets this _mostly_ right (i.e. does the appropriate > > cache maintenance), the placement of the data itself is problematic and > > could share a cache line with something else. > > > > Depending on how unforgiving an implementation's handling of mismatched > > attributes is, this could lead to data corruption. In one observed case, > > the system_cpucaps shared a line with arm64_sw_feature_override and the > > cpucaps got nuked after entering the hyp stub... > > This doesn't sound right. Non-cacheable/Device reads should not lead to > corruption of a cached copy regardless of whether that cached copy is > clean or dirty. > > The corruption suggests that either we're performing a *write* with > mismatched attributes (in which case the use of .mmuoff.data.read below > isn't quite right), or we have a plan invalidate somewhere without a > clean (and e.g. something else might need to be moved into > .mmuoff.data.write). > > Seconding Ard's point, I think we need to understand this scenario > better. Of course. So the write to the idreg override is fine and gets written back after cache maintenance. What's happening afterwards is CPU0 pulls in the line to write to system_cpucaps which also happens to contain arm64_sw_feature_override. That line is in UD state when CPU0 calls HVC_FINALISE_EL2 and goes to EL2 with the MMU off. __finalise_el2() does a load on arm64_sw_feature_override which goes out as a ReadNoSnp. I is the only cache state for this request (IHI 0050G B4.2.1.2) and the SF stops tracking the line, so writeback never actually goes anywhere. This patch might have been a touch premature, let me double check a few things on our side. -- Thanks, Oliver