>>
>> NVIDIA's recently introduced Grace Blackwell (GB) Superchip in
>> continuation with the Grace Hopper (GH) superchip that provides a
>> cache coherent access to CPU and GPU to each other's memory with
>> an internal proprietary chip-to-chip (C2C) cache coherent interconnect.
>> The in-tree nvgrace-gpu driver manages the GH devices. The intention
>> is to extend the support to the new Grace Blackwell boards.
>
> Where do we stand on QEMU enablement of GH, or the GB support here?
> IIRC, the nvgrace-gpu variant driver was initially proposed with QEMU
> being the means through which the community could make use of this
> driver, but there seem to be a number of pieces missing for that
> support. Thanks,
>
> Alex

Hi Alex,

The QEMU enablement changes for GH are already in QEMU 9.0. This is the
generic initiator change that got merged:
https://lore.kernel.org/all/20240308145525.10886-1-ankita@xxxxxxxxxx/

The missing pieces are actually on the KVM/kernel side, viz:

1. KVM needs to map the device memory as Normal. The KVM patch was
   proposed here, and it needs a refresh to address the review
   suggestions:
   https://lore.kernel.org/all/20230907181459.18145-2-ankita@xxxxxxxxxx/

2. The ECC handling series for the GPU device memory that is mapped
   with remap_pfn_range():
   https://lore.kernel.org/all/20231123003513.24292-1-ankita@xxxxxxxxxx/

With those changes, GH would be functional with QEMU 9.0.

We discovered a separate QEMU issue while verifying Grace Blackwell,
where the 512G of highmem defined here proved insufficient:
https://github.com/qemu/qemu/blob/v9.0.0/hw/arm/virt.c#L211

We are planning to float a proposal for a fix for that.

Thanks
Ankit Agrawal