On Wed, Apr 12, 2023 at 01:28:08PM +0100, Marc Zyngier wrote:
> On Wed, 05 Apr 2023 19:01:28 +0100,
> <ankita@xxxxxxxxxx> wrote:
> >
> > From: Ankit Agrawal <ankita@xxxxxxxxxx>
> >
> > NVIDIA's upcoming Grace Hopper Superchip provides a PCI-like device
> > for the on-chip GPU that is the logical OS representation of the
> > internal proprietary cache coherent interconnect.
> >
> > This representation has a number of limitations compared to a real PCI
> > device, in particular, it does not model the coherent GPU memory
> > aperture as a PCI config space BAR, and PCI doesn't know anything
> > about cacheable memory types.
> >
> > Provide a VFIO PCI variant driver that adapts the unique PCI
> > representation into a more standard PCI representation facing
> > userspace. The GPU memory aperture is obtained from ACPI, according to
> > the FW specification, and exported to userspace as the VFIO_REGION
> > that covers the first PCI BAR. qemu will naturally generate a PCI
> > device in the VM where the cacheable aperture is reported in BAR1.
> >
> > Since this memory region is actually cache coherent with the CPU, the
> > VFIO variant driver will mmap it into VMA using a cacheable mapping.
> >
> > As this is the first time an ARM environment has placed cacheable
> > non-struct page backed memory (eg from remap_pfn_range) into a KVM
> > page table, fix a bug in ARM KVM where it does not copy the cacheable
> > memory attributes from non-struct page backed PTEs to ensure the guest
> > also gets a cacheable mapping.
>
> This is not a bug, but a conscious design decision. As you pointed out
> above, nothing needed this until now, and a device mapping is the only
> safe thing to do as we know exactly *nothing* about the memory that
> gets mapped.

IMHO, from the mm perspective, the bug is using pfn_is_map_memory() to
determine the cachability or device memory status of a PFN in a VMA. That
is not what that API is for.

The cachability should be determined by the pgprot bits in the VMA.

VM_IO is the flag that says the VMA maps memory with side-effects.

I understand in ARM KVM it is not allowed for the VM and host to have
different cachability, so mis-detecting host cachable memory and making
it forced non-cachable in the VM is not a safe thing to do?

Jason
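
To make the distinction concrete, here is a rough userspace sketch, not kernel code. Every type and function in it (mock_vma, mock_pfn_is_map_memory(), stage2_attr_by_pfn(), stage2_attr_by_vma(), the MOCK_VM_IO bit and the PFN cutoff) is a hypothetical stand-in invented for illustration. It only models the two ways of choosing the guest mapping type discussed above: by asking whether the PFN is in the kernel's memory map, or by following what the VMA itself says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real kernel definitions. */
enum mem_attr { ATTR_DEVICE, ATTR_NORMAL_NC, ATTR_NORMAL_CACHED };

#define MOCK_VM_IO 0x1u /* models VM_IO: the mapping has side effects */

struct mock_vma {
    uint32_t vm_flags;       /* models vma->vm_flags */
    enum mem_attr page_prot; /* models the memory type encoded in vm_page_prot */
};

/* Models pfn_is_map_memory(): true only for PFNs in the kernel's memory map.
 * The cutoff is arbitrary and exists only to make the example testable. */
static bool mock_pfn_is_map_memory(uint64_t pfn)
{
    return pfn < 0x1000;
}

/* Policy criticised above: infer the mapping type from the PFN alone, so
 * anything outside the memory map is treated as device memory. */
static enum mem_attr stage2_attr_by_pfn(uint64_t pfn)
{
    return mock_pfn_is_map_memory(pfn) ? ATTR_NORMAL_CACHED : ATTR_DEVICE;
}

/* Policy suggested above: cachability comes from the VMA's pgprot bits. */
static enum mem_attr stage2_attr_by_vma(const struct mock_vma *vma)
{
    assert(vma != NULL);
    return vma->page_prot;
}

/* Side effects are a separate question, answered by the VM_IO flag. */
static bool vma_has_side_effects(const struct mock_vma *vma)
{
    assert(vma != NULL);
    return (vma->vm_flags & MOCK_VM_IO) != 0;
}
```

With a VMA whose pgprot is cacheable but whose PFN lies outside the memory map (the coherent GPU aperture case), the PFN-based policy in this model yields a device mapping while the VMA-based policy yields a cacheable one, which is exactly the host/guest mismatch in question. How a real implementation should combine the pgprot type with VM_IO is deliberately left out of the sketch.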