Re: [BUG] ARM64 KVM: Data abort executing post-indexed LDR on MMIO address

Marc Zyngier <maz@xxxxxxxxxx> · Sat, 05 Oct 2024 22:35:49 +0100

On Sat, 05 Oct 2024 19:38:23 +0100,
Ahmad Fatoum <a.fatoum@xxxxxxxxxxxxxx> wrote:
> 
> Hello Marc,
> 
> On 05.10.24 12:31, Marc Zyngier wrote:
> > On Fri, 04 Oct 2024 20:50:18 +0100,
> > Ahmad Fatoum <a.fatoum@xxxxxxxxxxxxxx> wrote:
> >> With readl/writel implemented in assembly, I get beyond that point, but
> >> now I get a data abort running a DC IVAC instruction on address 0x1000,
> >> where the cfi-flash is located. This instruction is part of a routine
> >> to remap the cfi-flash to start a page later, so the zero page can be
> >> mapped faulting.
> 
> [snip]
> 
> >> Any idea what this is about?
> > 
> > IIRC, the QEMU flash is implemented as a read-only memslot. A data
> > cache invalidation is a *write*, as it can be (and is) upgraded to a
> > clean+invalidate by the HW.
> 
> So it's a write, even if there are no dirty cache lines?

Yes.

At the point where this is handled, the CPU has no clue about the
dirty state of an arbitrary cache line, at an arbitrary cache level.
The CPU simply forwards the CMO downstream, and the permission check
happens way before that.
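
To make the ordering concrete, here is a toy model of the permission
check, not actual KVM or architecture code. The enum, the struct and
the function names are all made up for illustration. The point is that
the check takes no "is the line dirty?" input at all: DC IVAC is
classified as a write up front, so a read-only mapping rejects it.

```c
#include <stdbool.h>

/* Hypothetical access kinds for this toy model; not kernel names. */
enum access_kind { ACCESS_READ, ACCESS_WRITE, ACCESS_DC_IVAC };

/* Hypothetical view of a mapping's permissions. */
struct toy_mapping {
	bool readable;
	bool writable;	/* false for a read-only memslot such as the flash */
};

/*
 * DC IVAC is treated like a write: it can be upgraded to a
 * clean+invalidate, which may write a line back to memory.
 */
static bool needs_write_permission(enum access_kind kind)
{
	return kind == ACCESS_WRITE || kind == ACCESS_DC_IVAC;
}

/*
 * Returns true if the access may proceed, false if it faults.
 * Note that the dirty state of the cache line is not an input.
 */
static bool toy_access_permitted(const struct toy_mapping *m,
				 enum access_kind kind)
{
	if (needs_write_permission(kind))
		return m->writable;
	return m->readable;
}
```

With a mapping that is readable but not writable, a plain read passes
this check while both a store and a DC IVAC are refused.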

> > KVM cannot satisfy the write, for obvious reasons, and tells the guest
> > to bugger off (__gfn_to_pfn_memslot() returns KVM_PFN_ERR_RO_FAULT,
> > which satisfies is_error_noslot_pfn() -- a slight oddity, but hey, why
> > not).
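
For reference, the oddity mentioned above falls out of how the error
values are encoded. The following is a standalone userspace sketch
that mirrors, from memory, the definitions in include/linux/kvm_host.h
around v6.11. Treat the exact values as an assumption and check them
against your tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t kvm_pfn_t;
static_assert(sizeof(kvm_pfn_t) == 8, "encodings below use bits 52..63");

/* Bits 52..62 flag an error pfn, bit 63 flags the "no slot" pfn. */
#define KVM_PFN_ERR_MASK	(0x7ffULL << 52)
#define KVM_PFN_ERR_NOSLOT_MASK	(0xfffULL << 52)
#define KVM_PFN_NOSLOT		(0x1ULL << 63)

#define KVM_PFN_ERR_FAULT	(KVM_PFN_ERR_MASK)
#define KVM_PFN_ERR_HWPOISON	(KVM_PFN_ERR_MASK + 1)
#define KVM_PFN_ERR_RO_FAULT	(KVM_PFN_ERR_MASK + 2)

/* True for any of the KVM_PFN_ERR_* values. */
static inline bool is_error_pfn(kvm_pfn_t pfn)
{
	return !!(pfn & KVM_PFN_ERR_MASK);
}

/* True for an error pfn *or* the noslot pfn, hence the name. */
static inline bool is_error_noslot_pfn(kvm_pfn_t pfn)
{
	return !!(pfn & KVM_PFN_ERR_NOSLOT_MASK);
}

/* True only for the noslot pfn itself. */
static inline bool is_noslot_pfn(kvm_pfn_t pfn)
{
	return pfn == KVM_PFN_NOSLOT;
}
```

KVM_PFN_ERR_RO_FAULT has bits 52..62 set, so it matches both
is_error_pfn() and is_error_noslot_pfn(), even though a memslot does
exist in this case.
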
> > 
> > In the end, you get an exception. We could relax this by
> > special-casing CMOs to RO memslots, but this doesn't look great.
> > 
> > The real question is: what are you trying to achieve with this?
> 
> barebox sets up the MMU, but tries to keep a 1:1 mapping. On Virt, we
> want to map the zero page faulting, but still have access to the first
> block of the cfi-flash.
> 
> Therefore, barebox will map the cfi-flash one page later
> (virt 0x1000,0x2000,... -> phys 0x0000,0x1000,...) and so on, so the first
> page can be mapped faulting.
> 
> The routine[1] that does this remapping invalidates the virtual address range,
> because the attributes may change[2]. This invalidate also happens for cfi-flash,
> but we should never encounter dirty cache lines there as the remap is done
> before driver probe.
>
> Can you advise what should be done differently?

If you always map the flash as Device memory, there is no need for
CMOs. Same thing if you map it as Normal Non-cacheable (NC). And even
if you did map it as
Cacheable, it wouldn't matter. KVM already handles coherency when the
flash is switching between memory-mapped and programming mode, as the
physical address space changes (the flash literally drops from the
memory map).
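
On the guest side, one way to act on this is to have the remap routine
skip the invalidation for ranges that were never mapped Cacheable.
This is only a sketch: the map types and the helper name are
hypothetical and do not correspond to barebox's actual flags or
functions.

```c
#include <stdbool.h>

/* Hypothetical memory types; barebox has its own set of flags. */
enum map_type { MAP_FAULT, MAP_DEVICE, MAP_UNCACHED, MAP_CACHED };

/*
 * Decide whether a range needs cache maintenance before it is
 * remapped. Only a mapping that was Cacheable can have allocated
 * lines in the cache, so Device, NC and faulting mappings need
 * nothing. This is deliberately conservative for the Cacheable case.
 */
static bool remap_needs_cache_maintenance(enum map_type old_type)
{
	return old_type == MAP_CACHED;
}
```

A caller would then issue the DC IVAC loop only when this returns
true, which means the flash, mapped as Device or NC, never sees a CMO.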

In general, issuing CMOs to a device is a bizarre concept, because it
is pretty rare that a device can handle a full cache-line as
write-back. Devices tend to handle smaller, register-sized accesses,
not a full 64-byte eviction.

Now, I'm still wondering whether we should simply forward the CMO to
userspace as we do for other writes, and let the VMM deal with it. The
main issue is that there is no current way to describe this.

The alternative would be to silently handle the trap and pretend it
never occurred, as we do for other bizarre behaviours. But that'd be
something only new kernels would support, and I guess you'd like your
guest to work today, not tomorrow.

	M.

-- 
Without deviation from the norm, progress is not possible.