Re: [PATCH 1/2] ARM: kvm: define PAGE_S2_DEVICE as read-only by default

Mario Smarduch <m.smarduch@xxxxxxxxxxx> · Mon, 15 Sep 2014 12:41:51 -0700

I've been working around the edges of this discussion, and
maybe be little unclear on this.

But the manuals say intersection of Stage1/Stage2 permissions are
used. If guest sets stage1 to read-only then setting stage 2
to read-only or read-write should have no impact. So why
should stage 2 RW be changed?

- Mario

On 09/14/2014 03:57 PM, Ard Biesheuvel wrote:
> On 14 September 2014 11:43, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
>> On 14 September 2014 11:09, Marc Zyngier <maz@xxxxxxxxxxxxxxx> wrote:
>>> On 2014-09-14 05:49, Ard Biesheuvel wrote:
>>>>
>>>> On 13 September 2014 19:06, Christoffer Dall
>>>> <christoffer.dall@xxxxxxxxxx> wrote:
>>>>>
>>>>> On Sat, Sep 13, 2014 at 01:15:45PM +0200, Ard Biesheuvel wrote:
>>>>>>
>>>>>> On 13 September 2014 12:41, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>>>>>> Hi Ard,
>>>>>>>
>>>>>>> On 2014-09-13 11:17, Ard Biesheuvel wrote:
>>>>>>>>
>>>>>>>> Now that we support read-only memslots, we need to make sure that
>>>>>>>> pass-through device mappings are not mapped writable if the guest
>>>>>>>> has requested them to be read-only. The existing implementation
>>>>>>>> already honours this by calling kvm_set_s2pte_writable() on the new
>>>>>>>> pte in case of writable mappings, so all we need to do is define
>>>>>>>> the default pgprot_t value used for devices to be PTE_S2_RDONLY.
>>>>>>>>
>>>>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>>>>>>>
>>>>>>>
>>>>>>> I feel very uncomfortable with this change. Why would we map a device
>>>>>>> RO? Is
>>>>>>> that only for completeness sake?
>>>>>>>
>>>>>>
>>>>>> We would map a device RO so that QEMU (or whatever is managing KVM)
>>>>>> can emulate the writes. I don't have a clear cut use case, to be
>>>>>> honest, but setting up a writable mapping for a memslot that was
>>>>>> explicitly set up as read-only seems wrong in any case.
>>>>>
>>>>>
>>>>> Agreed, if it doesn't ever make sense to do so, then we should return an
>>>>> error to user space if userspace attempts such a configuration.  The
>>>>> current code is just weird.
>>>>>
>>>>>>
>>>>>> Note that the particular problem I was seeing was primarily caused by
>>>>>> kvm_is_mmio_pfn()'s false positive on the zero page, but it unveiled
>>>>>> this particular issue as well.
>>>>>>
>>>>>>> Note that we also use PAGE_S2_DEVICE for things that are not mapped
>>>>>>> through
>>>>>>> a memslot, such as the GIC.
>>>>>>>
>>>>>>
>>>>>> Yes, and I realize now that this breaks it.
>>>>>> My apologies: I have an additional patch locally that sets up MMIO
>>>>>> ranges in one go instead of faulting them in one page at a time as we
>>>>>> do now, and there the read-write case is handled correctly in
>>>>>> kvm_phys_addr_ioremap(). However, I thought it was better to send
>>>>>> these out separately first, but apparently not.
>>>>>
>>>>>
>>>>> I think it is better to change this separately, and then add the ioremap
>>>>> stuff.  However, you need to change all places that call PAGE_S2_DEVICE
>>>>> and expect a RDWR memory region, this happens to be only
>>>>> kvm_phys_addr_ioremap() for now.
>>>>>
>>>>>>
>>>>>> So if we can agree on whether or not MMIO backed mappings should be
>>>>>> read-write even if the memslot says no, I will follow up with a proper
>>>>>> series if there are still changes required.
>>>>>>
>>>>>
>>>>> I guess it could be theoretically useful to have read-only device memory
>>>>> regions, and I can't think of why it would violate the architecture.
>>>>>
>>>>
>>>> We have to handle it either way. But after reading D4.5.3 (Table
>>>> D4-40) of the ARM ARM, I am wondering why we needed patch b88657674d39
>>>> "ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping" in the
>>>> first place, and we could just revert that patch and everything would
>>>> work as expected. (In short, D4.5.3 says that MT_DEVICE trumps
>>>> MT_NORMAL, so if the stage 1 translation is MT_DEVICE, it doesn't
>>>> matter what memtype the S2 translation specifies)
>>>
>>>
>>> We've been there before:
>>> https://lists.cs.columbia.edu/pipermail/kvmarm/2013-May/004420.html
>>>
>>
>> Ah right. So why did those patches not make it in?
>>
> 
> Never mind. I read the whole thread this time.
> 
> So, in summary, there is a concern that a malicious guest may request
> a cachable mapping for a device range, in an attempt to manipulate the
> VGIC or other device memory of another VM.
> I think that concern only applies to writable mappings, so perhaps we
> should just change
> 
> if (kvm_is_mmio_pfn(pfn))
> 
> to
> 
> if (kvm_is_mmio_pfn(pfn) && writable)
> 
> and be done with it (which is coincidentally the very first naive fix
> I suggested for the issue i was seeing)
> That way, we never map read-only MMIO regions writable, and rely on
> the MT_DEVICE trumps MT_NORMAL rule to ensure the guest reads to those
> regions are uncached.
> (Wouldn't hurt to add a comment to explain it, I suppose)
> 

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm