Hi, On Fri, May 06, 2022 at 09:55:54PM +0100, Nikos Nikoleris wrote: > From: Andrew Jones <drjones@xxxxxxxxxx> > > The commit message of commit 410b3bf09e76 ("arm/arm64: Perform dcache > clean + invalidate after turning MMU off") justifies cleaning and > invalidating the dcache after disabling the MMU by saying it's nice > not to rely on the current page tables and that it should still work > (per the spec), as long as there's an identity map in the current > tables. Doing the invalidation after also somewhat helped with That's not what the commit says (well, that's now what I wanted to say in the commit, it might be that I haven't been clear enough): "Data caches are PIPT and the VAs are translated using the current translation tables, or an identity mapping (what Arm calls a "flat mapping") when the MMU is off". That "flat mapping" does not rely on the TTBRx_EL1 tables, it means that the output address (the physical address) is the same as the input address (the virtual address). No actual translation is taking place. > reenabling the MMU without seeing stale data, but the real problem > with reenabling was because the cache needs to be disabled with > the MMU, but it wasn't. That's not what ARM DDI 0487H.a says on page D5-4826 when HCR_EL2.DC == 0 (which is how KVM configures HCR_EL2): "For all other accesses, when stage 1 address translation is disabled, the assigned attributes depend on whether the access is a data access or an instruction access, as follows: Data access The stage 1 translation assigns the Device-nGnRnE memory type." When the MMU is off, data accesses are non-cacheable. > > Since we have to trust/validate that the current page tables have an > identity map anyway, then there's no harm in doing the clean > and invalidate first (it feels a little better to do so, anyway, > considering the cache maintenance instructions take virtual > addresses). Then, also disable the cache with the MMU to avoid That's questionable, CPU can speculate reads which allocate a new dcache entry after clean + invalidate and before the MMU is turned off, thus making the clean + invalidate rather useless. > problems when reenabling. We invalidate the Icache and disable ARM DDI 0487H.a is pretty clear when icache maintainance is required on page D5-4933: "Any permitted instruction cache implementation can be described as implementing the IVIPT Extension to the Arm architecture. The formal definition of the Arm IVIPT Extension is that it reduces the instruction cache maintenance requirement to the following condition: - Instruction cache maintenance is required only after writing new data to a PA that holds an instruction." If you are seeing issues that are solved by doing an icache invalidation, I would look first at what the EFI spec guarantees regarding icache maintenance, because kvm-unit-tests doesn't modify its instructions. > that too for good measure. And, a final TLB invalidation ensures > we're crystal clean when we return from asm_mmu_disable(). If I were to guess, any issues that you are seeing are caused by the fact that EFI apps start with the MMU enabled, and kvm-unit-tests so far has assumed that the tests start with the MMU disabled. Thanks, Alex > > Cc: Alexandru Elisei <alexandru.elisei@xxxxxxx> > Signed-off-by: Andrew Jones <drjones@xxxxxxxxxx> > Signed-off-by: Nikos Nikoleris <nikos.nikoleris@xxxxxxx> > --- > arm/cstart.S | 28 +++++++++++++++++++++------- > arm/cstart64.S | 21 ++++++++++++++++----- > 2 files changed, 37 insertions(+), 12 deletions(-) > > diff --git a/arm/cstart.S b/arm/cstart.S > index 7036e67..dc324c5 100644 > --- a/arm/cstart.S > +++ b/arm/cstart.S > @@ -179,6 +179,7 @@ halt: > .globl asm_mmu_enable > asm_mmu_enable: > /* TLBIALL */ > + mov r2, #0 > mcr p15, 0, r2, c8, c7, 0 > dsb nsh > > @@ -211,12 +212,7 @@ asm_mmu_enable: > > .globl asm_mmu_disable > asm_mmu_disable: > - /* SCTLR */ > - mrc p15, 0, r0, c1, c0, 0 > - bic r0, #CR_M > - mcr p15, 0, r0, c1, c0, 0 > - isb > - > + /* Clean + invalidate the entire memory */ > ldr r0, =__phys_offset > ldr r0, [r0] > ldr r1, =__phys_end > @@ -224,7 +220,25 @@ asm_mmu_disable: > sub r1, r1, r0 > dcache_by_line_op dccimvac, sy, r0, r1, r2, r3 > > - mov pc, lr > + /* Invalidate Icache */ > + mov r0, #0 > + mcr p15, 0, r0, c7, c5, 0 > + isb > + > + /* Disable cache, Icache and MMU */ > + mrc p15, 0, r0, c1, c0, 0 > + bic r0, #CR_C > + bic r0, #CR_I > + bic r0, #CR_M > + mcr p15, 0, r0, c1, c0, 0 > + isb > + > + /* Invalidate TLB */ > + mov r0, #0 > + mcr p15, 0, r0, c8, c7, 0 > + dsb nsh > + > + mov pc, lr > > /* > * Vectors > diff --git a/arm/cstart64.S b/arm/cstart64.S > index e4ab7d0..390feb9 100644 > --- a/arm/cstart64.S > +++ b/arm/cstart64.S > @@ -246,11 +246,6 @@ asm_mmu_enable: > > .globl asm_mmu_disable > asm_mmu_disable: > - mrs x0, sctlr_el1 > - bic x0, x0, SCTLR_EL1_M > - msr sctlr_el1, x0 > - isb > - > /* Clean + invalidate the entire memory */ > adrp x0, __phys_offset > ldr x0, [x0, :lo12:__phys_offset] > @@ -259,6 +254,22 @@ asm_mmu_disable: > sub x1, x1, x0 > dcache_by_line_op civac, sy, x0, x1, x2, x3 > > + /* Invalidate Icache */ > + ic iallu > + isb > + > + /* Disable cache, Icache and MMU */ > + mrs x0, sctlr_el1 > + bic x0, x0, SCTLR_EL1_C > + bic x0, x0, SCTLR_EL1_I > + bic x0, x0, SCTLR_EL1_M > + msr sctlr_el1, x0 > + isb > + > + /* Invalidate TLB */ > + tlbi vmalle1 > + dsb nsh > + > ret > > /* > -- > 2.25.1 >