From: Ankit Agrawal <ankita@xxxxxxxxxx>

The NVIDIA Grace Hopper GPUs have device memory that is supposed to be
used as regular RAM. It is accessible through the CPU-GPU chip-to-chip
cache coherent interconnect and is present in the system physical
address space. The device memory is split into two regions - termed
usemem and resmem - in the system physical address space, with each
region mapped and exposed to the VM as a separate fake device BAR [1].

Owing to a hardware defect in the Multi-Instance GPU (MIG) feature [2],
there is a requirement - as a workaround - for the resmem BAR to
display uncached memory characteristics. Based on [3], on systems with
FWB enabled such as Grace Hopper, the requisite properties (uncached,
unaligned access) can be achieved through a VM mapping (S1) of
NORMAL_NC and a host mapping (S2) of MT_S2_FWB_NORMAL_NC.

KVM currently maps the MMIO region in S2 as MT_S2_FWB_DEVICE_nGnRE by
default. The fake device BARs thus display DEVICE_nGnRE behavior in the
VM.

The following table summarizes the behavior for the various S1 and S2
mapping combinations for systems with FWB enabled [3].

S1           |  S2           | Result
NORMAL_WB    |  NORMAL_NC    | NORMAL_NC
NORMAL_WT    |  NORMAL_NC    | NORMAL_NC
NORMAL_NC    |  NORMAL_NC    | NORMAL_NC
NORMAL_WB    |  DEVICE_nGnRE | DEVICE_nGnRE
NORMAL_WT    |  DEVICE_nGnRE | DEVICE_nGnRE
NORMAL_NC    |  DEVICE_nGnRE | DEVICE_nGnRE

Recently a change was added that modifies this default behavior and
makes KVM map MMIO as MT_S2_FWB_NORMAL_NC when the VMA flag
VM_ALLOW_ANY_UNCACHED is set. Setting S2 as MT_S2_FWB_NORMAL_NC
provides the desired behavior (uncached, unaligned access) for resmem.

The same setting is extended to usemem as a middle-of-the-road setting
that takes it closer to the desired final system memory
characteristics (cached, unaligned). This will eventually be fixed with
the ongoing proposal [4].

To use the VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
no action taken on the MMIO mapping can trigger an uncontained failure.
The Grace Hopper satisfies this requirement. So set the
VM_ALLOW_ANY_UNCACHED flag in the VMA.

Applied over next-20240227.
base-commit: 22ba90670a51

Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@xxxxxxxxxx/ [1]
Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
Link: https://lore.kernel.org/all/20230907181459.18145-2-ankita@xxxxxxxxxx/ [4]

Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>
Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
Cc: Vikram Sethi <vsethi@xxxxxxxxxx>
Cc: Zhi Wang <zhiw@xxxxxxxxxx>
Signed-off-by: Ankit Agrawal <ankita@xxxxxxxxxx>
---
 drivers/vfio/pci/nvgrace-gpu/main.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index 25814006352d..5539c9057212 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -181,6 +181,24 @@ static int nvgrace_gpu_mmap(struct vfio_device *core_vdev,
 
 	vma->vm_pgoff = start_pfn;
 
+	/*
+	 * The VM_ALLOW_ANY_UNCACHED VMA flag is implemented for ARM64,
+	 * allowing KVM stage 2 device mapping attributes to use Normal-NC
+	 * rather than DEVICE_nGnRE, which allows guest mappings
+	 * supporting write-combining attributes (WC). This also
+	 * unlocks memory-like operations such as unaligned accesses.
+	 * This setting suits the fake BARs as they are expected to
+	 * demonstrate such properties within the guest.
+	 *
+	 * ARM does not architecturally guarantee this is safe, and indeed
+	 * some MMIO regions like the GICv2 VCPU interface can trigger
+	 * uncontained faults if Normal-NC is used. The nvgrace-gpu
+	 * however is safe in that the platform guarantees that no
+	 * action taken on the MMIO mapping can trigger an uncontained
+	 * failure. Hence VM_ALLOW_ANY_UNCACHED is set in the VMA flags.
+	 */
+	vm_flags_set(vma, VM_ALLOW_ANY_UNCACHED);
+
 	return 0;
 }
 
-- 
2.34.1
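
Illustration only, not part of the patch: the S1/S2 table in the commit
message can be read as a small lookup. The sketch below is standalone
userspace C that encodes just the six rows listed there, plus the S2
choice the commit message describes (DEVICE_nGnRE by default, NORMAL_NC
when VM_ALLOW_ANY_UNCACHED is set). All enum and function names are
hypothetical and are not kernel identifiers; combinations outside the
table (for example a Device S1 mapping) are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of these exist in the kernel. */
enum s1_attr { S1_NORMAL_WB, S1_NORMAL_WT, S1_NORMAL_NC, S1_COUNT };
enum s2_attr { S2_NORMAL_NC, S2_DEVICE_nGnRE, S2_COUNT };
enum result_attr { RES_NORMAL_NC, RES_DEVICE_nGnRE };

/*
 * The six S1/S2 rows from the commit message (FWB enabled), indexed as
 * fwb_result[s1][s2]. Column order follows enum s2_attr.
 */
static const enum result_attr fwb_result[S1_COUNT][S2_COUNT] = {
	[S1_NORMAL_WB] = { RES_NORMAL_NC, RES_DEVICE_nGnRE },
	[S1_NORMAL_WT] = { RES_NORMAL_NC, RES_DEVICE_nGnRE },
	[S1_NORMAL_NC] = { RES_NORMAL_NC, RES_DEVICE_nGnRE },
};

/*
 * S2 attribute for an MMIO mapping as described in the commit message:
 * DEVICE_nGnRE by default, NORMAL_NC when the VMA carries
 * VM_ALLOW_ANY_UNCACHED (modelled here as a plain bool).
 */
enum s2_attr s2_for_mmio(bool allow_any_uncached)
{
	return allow_any_uncached ? S2_NORMAL_NC : S2_DEVICE_nGnRE;
}

/* Effective attribute seen by the guest for a given S1 mapping. */
enum result_attr guest_view(enum s1_attr s1, bool allow_any_uncached)
{
	return fwb_result[s1][s2_for_mmio(allow_any_uncached)];
}

/* Checks the two cases the commit message cares about for resmem. */
void fwb_selfcheck(void)
{
	/* Without the flag, a guest NORMAL_NC mapping still behaves as Device. */
	assert(guest_view(S1_NORMAL_NC, false) == RES_DEVICE_nGnRE);
	/* With the flag, the same guest mapping gets Normal-NC behavior. */
	assert(guest_view(S1_NORMAL_NC, true) == RES_NORMAL_NC);
}
```

For the rows listed, the result follows the S2 attribute regardless of
S1, which is why setting the flag on the VMA is enough to give resmem
the uncached, unaligned-access behavior it needs.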