On Tue, Mar 18, 2025 at 12:30:43PM -0700, Oliver Upton wrote:
> On Tue, Mar 18, 2025 at 09:55:27AM -0300, Jason Gunthorpe wrote:
> > On Tue, Mar 18, 2025 at 09:39:30AM +0000, Marc Zyngier wrote:
> >
> > > The memslot must also be created with a new flag ((2c) in the taxonomy
> > > above) that carries the "Please map VM_PFNMAP VMAs as cacheable". This
> > > flag is only allowed if (1) is valid.
> > >
> > > This results in the following behaviours:
> > >
> > > - If the VMM creates the memslot with the cacheable attribute without
> > >   (1) being advertised, we fail.
> > >
> > > - If the VMM creates the memslot without the cacheable attribute, we
> > >   map as NC, as it is today.
> >
> > Is that OK though?
> >
> > Now we have the MM page tables mapping this memory as cachable but KVM
> > and the guest is accessing it as non-cached.
> >
> > I thought ARM tried hard to avoid creating such mismatches? This is
> > why the pgprot flags were used to drive this, not an opt-in flag. To
> > prevent userspace from forcing a mismatch.
>
> It's far more problematic the other way around, e.g. the host knows that
> something needs a Device-* attribute and the VM has done something
> cacheable. The endpoint for that PA could, for example, fall over when
> lines pulled in by the guest are written back, which of course can't
> always be traced back to the offending VM.
>
> OTOH, if the host knows that a PA is cacheable and the guest does
> something non-cacheable, you 'just' have to deal with the usual
> mismatched attributes problem as laid out in the ARM ARM.

I think the issue is that KVM doesn't do that usual stuff (i.e. cache
flushing) for memory that doesn't have a struct page backing it. So
nothing in the hypervisor does any cache flushing, and I believe you
end up in a situation where the VMM could have zeroed this cacheable
memory with cacheable stores to sanitize it across VMs, KVM then puts
that memory into the VM as uncached, and the VM can read stale,
non-zeroed data from a prior VM. Yes? That is a security problem.

As I understand things, KVM must either do the cache flushing or must
not allow mismatched attributes, as a matter of security.

This is why FWB comes into the picture: KVM cannot do the cache
flushing of PFNMAP VMAs. So we force both the MM and the KVM S2 to be
cacheable and use S2 FWB to prevent the guest from ever making it
uncacheable. Thus the cache flushes are not required and everything is
safe from a security point of view.

So I think the logic we want in the fault handler is:

 Get the mm's PTE
 If it is cacheable:
    Check if it has a struct page:
       Yes - KVM flushes it and can use a non-FWB path
       No  - KVM either fails to install it, or installs it using FWB
             to force cacheability. KVM never allows degrading
             cacheable to non-cacheable when it can't do the flushing.
 Not cacheable:
    Install it with Normal-NC as was previously discussed and merged
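Putting the decision above into rough C, it would look something like
the sketch below. This is only an illustration of the flow: the
stage2_install_normal_nc(), stage2_flush_and_install() and
stage2_install_cacheable_fwb() helpers, the sketch_fault_path() name
and the two booleans standing in for the real pte/struct-page checks
are all made up for the example, not actual KVM functions.

/*
 * Sketch only -- the stage2_*() helpers are invented names used to
 * illustrate the decision, not real KVM/arm64 functions.
 */
static int sketch_fault_path(struct kvm *kvm, kvm_pfn_t pfn,
                             bool pte_is_cacheable, bool has_struct_page)
{
        if (!pte_is_cacheable) {
                /* The mm says not cacheable: Normal-NC, as already merged */
                return stage2_install_normal_nc(kvm, pfn);
        }

        if (has_struct_page) {
                /* KVM can do the cache maintenance itself, non-FWB path is fine */
                return stage2_flush_and_install(kvm, pfn);
        }

        /*
         * Cacheable in the mm but no struct page, so KVM cannot flush.
         * Without FWB we must refuse rather than degrade to
         * non-cacheable; with FWB we pin the S2 to cacheable so the
         * guest can never create the mismatch.
         */
        if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
                return -EINVAL;

        return stage2_install_cacheable_fwb(kvm, pfn);
}

The property that matters is that the no-struct-page branch never
produces a non-cacheable S2 mapping on top of cacheable host memory.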
> Userspace should be stating intentions on the memslot with the sort of
> mapping that it wants to create, and a memslot flag to say "I allow
> cacheable mappings" seems to fit the bill.

I'm not sure about this; I don't see that userspace has any choice
here. As above, KVM has to follow whatever is in the PTEs, userspace
can't ask for something different.

At best you could make non-struct-page cacheable memory always fail
unless the flag is given - but why?

For a fast fail it seems sufficient to check, at memslot creation,
whether the VMA is VM_PFNMAP with a cacheable pgprot and, if
!FEAT_FWB, reject the memslot. There is no real recovery from this;
the VMM is doing something that cannot be supported.

> - Stage-2 faults serviced w/ a non-cacheable mapping if flag is not
>   set

As above, I think this creates a bug :\

Jason