On Thu, Feb 29, 2024 at 08:56:39AM -0700, Alex Williamson wrote:
> On Wed, 28 Feb 2024 19:48:01 +0000
> <ankita@xxxxxxxxxx> wrote:
>
> > From: Ankit Agrawal <ankita@xxxxxxxxxx>
> >
> > The NVIDIA Grace Hopper GPUs have device memory that is supposed to be
> > used as a regular RAM. It is accessible through CPU-GPU chip-to-chip
> > cache coherent interconnect and is present in the system physical
> > address space. The device memory is split into two regions - termed
> > as usemem and resmem - in the system physical address space,
> > with each region mapped and exposed to the VM as a separate fake
> > device BAR [1].
> >
> > Owing to a hardware defect for Multi-Instance GPU (MIG) feature [2],
> > there is a requirement - as a workaround - for the resmem BAR to
> > display uncached memory characteristics. Based on [3], on system with
> > FWB enabled such as Grace Hopper, the requisite properties
> > (uncached, unaligned access) can be achieved through a VM mapping (S1)
> > of NORMAL_NC and host mapping (S2) of MT_S2_FWB_NORMAL_NC.
> >
> > KVM currently maps the MMIO region in S2 as MT_S2_FWB_DEVICE_nGnRE by
> > default. The fake device BARs thus displays DEVICE_nGnRE behavior in the
> > VM.
> >
> > The following table summarizes the behavior for the various S1 and S2
> > mapping combinations for systems with FWB enabled [3].
> > S1           | S2           | Result
> > NORMAL_WB    | NORMAL_NC    | NORMAL_NC
> > NORMAL_WT    | NORMAL_NC    | NORMAL_NC
> > NORMAL_NC    | NORMAL_NC    | NORMAL_NC
> > NORMAL_WB    | DEVICE_nGnRE | DEVICE_nGnRE
> > NORMAL_WT    | DEVICE_nGnRE | DEVICE_nGnRE
> > NORMAL_NC    | DEVICE_nGnRE | DEVICE_nGnRE
> >
> > Recently a change was added that modifies this default behavior and
> > make KVM map MMIO as MT_S2_FWB_NORMAL_NC when a VMA flag
> > VM_ALLOW_ANY_UNCACHED is set. Setting S2 as MT_S2_FWB_NORMAL_NC
> > provides the desired behavior (uncached, unaligned access) for resmem.
> >
> > Such setting is extended to the usemem as a middle-of-the-road
> > setting to take it closer to the desired final system memory
> > characteristics (cached, unaligned). This will eventually be
> > fixed with the ongoing proposal [4].
> >
> > To use VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
> > no action taken on the MMIO mapping can trigger an uncontained
> > failure. The Grace Hopper satisfies this requirement. So set
> > the VM_ALLOW_ANY_UNCACHED flag in the VMA.
> >
> > Applied over next-20240227.
> > base-commit: 22ba90670a51
> >
> > Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@xxxxxxxxxx/ [1]
> > Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
> > Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
> > Link: https://lore.kernel.org/all/20230907181459.18145-2-ankita@xxxxxxxxxx/ [4]
> >
> > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> > Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Cc: Vikram Sethi <vsethi@xxxxxxxxxx>
> > Cc: Zhi Wang <zhiw@xxxxxxxxxx>
> > Signed-off-by: Ankit Agrawal <ankita@xxxxxxxxxx>
> > ---
> >  drivers/vfio/pci/nvgrace-gpu/main.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
> > index 25814006352d..5539c9057212 100644
> > --- a/drivers/vfio/pci/nvgrace-gpu/main.c
> > +++ b/drivers/vfio/pci/nvgrace-gpu/main.c
> > @@ -181,6 +181,24 @@ static int nvgrace_gpu_mmap(struct vfio_device *core_vdev,
> >
> >  	vma->vm_pgoff = start_pfn;
> >
> > +	/*
> > +	 * The VM_ALLOW_ANY_UNCACHED VMA flag is implemented for ARM64,
> > +	 * allowing KVM stage 2 device mapping attributes to use Normal-NC
> > +	 * rather than DEVICE_nGnRE, which allows guest mappings
> > +	 * supporting write-combining attributes (WC). This also
> > +	 * unlocks memory-like operations such as unaligned accesses.
> > +	 * This setting suits the fake BARs as they are expected to
> > +	 * demonstrate such properties within the guest.
> > +	 *
> > +	 * ARM does not architecturally guarantee this is safe, and indeed
> > +	 * some MMIO regions like the GICv2 VCPU interface can trigger
> > +	 * uncontained faults if Normal-NC is used. The nvgrace-gpu
> > +	 * however is safe in that the platform guarantees that no
> > +	 * action taken on the MMIO mapping can trigger an uncontained
> > +	 * failure. Hence VM_ALLOW_ANY_UNCACHED is set in the VMA flags.
> > +	 */
> > +	vm_flags_set(vma, VM_ALLOW_ANY_UNCACHED);
> > +
> >  	return 0;
> >  }
> >
>
> The commit log sort of covers it, but this comment doesn't seem to
> cover why we're setting an uncached attribute to the usemem region
> which we're specifically mapping as coherent... did we end up giving
> this flag a really poor name if it's being used here to allow unaligned
> access? Thanks,

Yeah, I suggested folding that hunk into this:

	if (index == RESMEM_REGION_INDEX)
		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);

So it makes more sense. VM_ALLOW_ANY_UNCACHED shouldn't be used on the
cacheable mapping.

The comment should be more specific to this driver and not so generic:

	/*
	 * nvgrace has no issue with uncontained failures on NORMAL_NC
	 * access. Tell KVM to open up guest usage of NORMAL_NC for this mapping.
	 */

Jason
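
To make the suggested fold concrete, here is a rough userspace model of
the intended control flow. It is not kernel code and not the final patch:
every mock_/MOCK_ name, the flag value, and the two-value page protection
enum are hypothetical stand-ins for the real struct vm_area_struct,
VM_ALLOW_ANY_UNCACHED, pgprot_writecombine() and RESMEM_REGION_INDEX. It
only shows that the flag would be set in the same branch as the
write-combine protection, so the cacheable usemem mapping is left alone.

```c
/*
 * Userspace model only. The mock_/MOCK_ identifiers are hypothetical
 * stand-ins for kernel types and constants; values are arbitrary.
 */
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the driver's region indexes. */
enum mock_region_index {
	MOCK_USEMEM_REGION_INDEX,
	MOCK_RESMEM_REGION_INDEX,
};

/* Stand-in for the VM_ALLOW_ANY_UNCACHED VMA flag bit. */
#define MOCK_VM_ALLOW_ANY_UNCACHED (1UL << 0)

/* Stand-in for the page protection, reduced to two states. */
enum mock_pgprot {
	MOCK_PGPROT_CACHED,
	MOCK_PGPROT_WRITECOMBINE,
};

/* Stand-in for the two VMA fields the hunk touches. */
struct mock_vma {
	unsigned long vm_flags;
	enum mock_pgprot vm_page_prot;
};

/*
 * Models the folded hunk: only the resmem mapping becomes
 * write-combined, and only that mapping tells KVM it may use
 * NORMAL_NC at stage 2. usemem keeps its cacheable attributes
 * and never gets the flag.
 */
static void mock_nvgrace_gpu_mmap_attrs(struct mock_vma *vma,
					enum mock_region_index index)
{
	assert(vma != NULL);

	if (index == MOCK_RESMEM_REGION_INDEX) {
		vma->vm_page_prot = MOCK_PGPROT_WRITECOMBINE;
		vma->vm_flags |= MOCK_VM_ALLOW_ANY_UNCACHED;
	}
}
```

Calling mock_nvgrace_gpu_mmap_attrs() with MOCK_USEMEM_REGION_INDEX leaves
both fields untouched, which is the point of the fold: the flag no longer
appears on a mapping that is meant to stay cacheable.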