On Wed, 05 Apr 2023 19:01:28 +0100,
<ankita@xxxxxxxxxx> wrote:
>
> From: Ankit Agrawal <ankita@xxxxxxxxxx>
>
> NVIDIA's upcoming Grace Hopper Superchip provides a PCI-like device
> for the on-chip GPU that is the logical OS representation of the
> internal proprietary cache coherent interconnect.
>
> This representation has a number of limitations compared to a real PCI
> device, in particular, it does not model the coherent GPU memory
> aperture as a PCI config space BAR, and PCI doesn't know anything
> about cacheable memory types.
>
> Provide a VFIO PCI variant driver that adapts the unique PCI
> representation into a more standard PCI representation facing
> userspace. The GPU memory aperture is obtained from ACPI, according to
> the FW specification, and exported to userspace as the VFIO_REGION
> that covers the first PCI BAR. qemu will naturally generate a PCI
> device in the VM where the cacheable aperture is reported in BAR1.
>
> Since this memory region is actually cache coherent with the CPU, the
> VFIO variant driver will mmap it into VMA using a cacheable mapping.
>
> As this is the first time an ARM environment has placed cacheable
> non-struct page backed memory (eg from remap_pfn_range) into a KVM
> page table, fix a bug in ARM KVM where it does not copy the cacheable
> memory attributes from non-struct page backed PTEs to ensure the guest
> also gets a cacheable mapping.

This is not a bug, but a conscious design decision. As you pointed out
above, nothing needed this until now, and a device mapping is the only
safe thing to do as we know exactly *nothing* about the memory that
gets mapped.

	M.

-- 
Without deviation from the norm, progress is not possible.