On Wed, May 15, 2024 at 12:00:49PM +0100, Suzuki K Poulose wrote:
> On 15/05/2024 10:01, Catalin Marinas wrote:
> > On Fri, Apr 12, 2024 at 09:42:09AM +0100, Steven Price wrote:
> > > From: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> > >
> > > Device mappings (currently) need to be emulated by the VMM so must be
> > > mapped shared with the host.
> >
> > You say "currently". What's the plan when the device is not emulated?
> > How would the guest distinguish what's emulated and what's not to avoid
> > setting the PROT_NS_SHARED bit?
>
> Arm CCA plans to add support for passing through real devices,
> which support PCI-TDISP protocol. This would involve the Realm
> authenticating the device and explicitly requesting "protected"
> mapping *after* the verification (with the help of RMM).

I'd have to do some reading, no clue how this works.

> > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > > index f5376bd567a1..db71c564ec21 100644
> > > --- a/arch/arm64/include/asm/pgtable.h
> > > +++ b/arch/arm64/include/asm/pgtable.h
> > > @@ -598,7 +598,7 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
> > >  #define pgprot_writecombine(prot) \
> > >  	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
> > >  #define pgprot_device(prot) \
> > > -	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
> > > +	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | PROT_NS_SHARED)
> >
> > This pgprot_device() is not the only one used to map device resources.
> > pgprot_writecombine() is another commonly used macro. It feels like a
> > hack to plug one but not the other and without any way for the guest to
> > figure out what's emulated.
>
> Agree. I have been exploring hooking this into ioremap_prot() where we
> could apply the attribute accordingly. We will change it in the next
> version.

pgprot_* at least has the advantage that it covers other places.
ioremap_prot() would handle the kernel mappings but you have devices
mapped in user-space via remap_pfn_range() for example. The protection
bits may come from dma_pgprot() with either write-combine or cacheable
attributes. One may map device I/O as well (not sure what DPDK does). We
could restrict those to protected devices but we need to go through the
use-cases.

All this needs some thinking, especially if at some point we'll have
protected devices. Just hijacking the low-level pgprot macros doesn't
feel like a great approach.

> > Can the DT actually place those emulated ranges in the higher IPA space
> > so that we avoid randomly adding this attribute for devices?
>
> It can, but then we kind of break the "Realm" view of the IPA space. i.e.,
> right now it only knows about the "lower IPA" half and uses the top bit as a
> protection attr to access the IPA as shared.
>
> Expanding IPA size view kind of breaks "sharing memory", where, we
> must "use a different PA" for a page that is now shared.

True, I did not realise that the IPA split is transparent to the host.
An option would be additional DT/ACPI attributes for those devices.
That's not great either though as we can't handle those attributes in
the arch code only and probably we don't want to change generic drivers.
Yet another option would be to query the RMM somehow.

-- 
Catalin
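
To make the asymmetry discussed above concrete, here is a rough user-space
model of the two macros from the quoted hunk. It is only a sketch, not kernel
code: the bit positions and memory-type indices are redefined locally for
illustration, the model_*() helpers, ns_shared_bit() and check_model() are
hypothetical names, and the 48-bit IPA size is an assumption. It models the
shared attribute as the top IPA bit, as described in the thread.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Local, illustrative definitions for this sketch (not taken from the kernel headers). */
#define PTE_ATTRINDX(t)		((pteval_t)(t) << 2)
#define PTE_ATTRINDX_MASK	((pteval_t)7 << 2)
#define PTE_PXN			((pteval_t)1 << 53)
#define PTE_UXN			((pteval_t)1 << 54)
#define MT_NORMAL_NC		2	/* assumed index */
#define MT_DEVICE_nGnRE		4	/* assumed index */

/* Clear the bits in 'mask', then OR in 'bits'. */
pteval_t model_pgprot_modify(pteval_t prot, pteval_t mask, pteval_t bits)
{
	return (prot & ~mask) | bits;
}

/* Assumption: the shared attribute is the top bit of an ipa_bits-wide IPA. */
pteval_t ns_shared_bit(unsigned int ipa_bits)
{
	return (pteval_t)1 << (ipa_bits - 1);
}

/* Models pgprot_device() after the patch: the shared bit is added. */
pteval_t model_pgprot_device(pteval_t prot, pteval_t ns_shared)
{
	return model_pgprot_modify(prot, PTE_ATTRINDX_MASK,
				   PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | ns_shared);
}

/* Models pgprot_writecombine(), which the patch leaves unchanged. */
pteval_t model_pgprot_writecombine(pteval_t prot)
{
	return model_pgprot_modify(prot, PTE_ATTRINDX_MASK,
				   PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN);
}

/* Device mappings gain the shared bit; write-combine mappings do not. */
void check_model(void)
{
	pteval_t shared = ns_shared_bit(48);	/* hypothetical 48-bit IPA */

	assert(model_pgprot_device(0, shared) & shared);
	assert((model_pgprot_writecombine(0) & shared) == 0);
}
```

In this model an emulated device mapped through the write-combine path would
end up without the shared attribute, which is the gap raised in the review.
Since the mask only covers the attribute index, the helper can set the shared
bit but never clears it from a prot value that already carries it.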