Hi Alex,
On 2021-02-12 17:12, Alexandru Elisei wrote:
Hi Marc,
I've been trying to get my head around what the architecture says about
CMOs, so
please bare with me if I misunderstood some things.
No worries. I've had this patch for a few weeks now, and can't
make up my mind about it. It does address an actual issue though,
so I couldn't just discard it... ;-)
On 2/11/21 2:27 PM, Marc Zyngier wrote:
It appears that when a guest traps into KVM because it is
performing a CMO on a Read Only memslot, our handling of
this operation is "slightly suboptimal", as we treat it as
an MMIO access without a valid syndrome.
The chances that userspace is adequately equiped to deal
with such an exception being slim, it would be better to
handle it in the kernel.
What we need to provide is roughly as follows:
(a) if a CMO hits writeable memory, handle it as a normal memory acess
(b) if a CMO hits non-memory, skip it
(c) if a CMO hits R/O memory, that's where things become fun:
(1) if the CMO is DC IVAC, the architecture says this should result
in a permission fault
(2) if the CMO is DC CIVAC, it should work similarly to (a)
When you say it should work similarly to (a), you mean it should be
handled as a
normal memory access, without the "CMO hits writeable memory" part,
right?
What I mean is that the cache invalidation should take place,
preferably without involving KVM at all (other than populating
S2 if required).
We already perform (a) and (b) correctly, but (c) is a total mess.
Hence we need to distinguish between IVAC (c.1) and CIVAC (c.2).
One way to do it is to treat CMOs generating a translation fault as
a *read*, even when they are on a RW memslot. This allows us to
further triage things:
If they come back with a permission fault, that is because this is
a DC IVAC instruction:
- inside a RW memslot: no problem, treat it as a write (a)(c.2)
- inside a RO memslot: inject a data abort in the guest (c.1)
The only drawback is that DC IVAC on a yet unmapped page faults
twice: one for the initial translation fault that result in a RO
mapping, and once for the permission fault. I think we can live with
that.
I'm trying to make sure I understand what the problem is.
gfn_to_pfn_prot() returnsKVM_HVA_ERR_RO_BAD if the write is to a RO
memslot.
KVM_HVA_ERR_RO_BAD is PAGE_OFFSET + PAGE_SIZE, which means that
is_error_noslot_pfn() return true. In that case we exit to userspace
with -EFAULT
for DC IVAC and DC CIVAC. But what we should be doing is this:
- For DC IVAC, inject a dabt with ISS = 0x10, meaning an external abort
(that's
what kvm_inject_dabt_does()).
- For DC CIVAC, exit to userspace with -EFAULT.
Did I get that right?
Not quite. What I *think* we should do is:
- DC CIVAC should just work, without going to userspace. I can't imagine
a reason why we'd involve userspace for this, and we currently don't
really have a good way to describe this to userspace.
- DC IVAC is more nuanced: we could either inject an exception (which
is what this patch does), or perform the CMO ourselves as a DC CIVAC
(consistent with the IVA->CIVA upgrade caused by having a S2
translation).
This second approach is comparable to what we do when the guest
issues a CMO on an emulated MMIO address (we don't inject a fault).
Thanks,
M.
--
Jazz is not dead. It just smells funny...