On Wed, 5 Jul 2023 18:37:42 +0000
Ankit Agrawal <ankita@xxxxxxxxxx> wrote:

> > I had also asked in the previous review whether "nvgpu" is already overused. I
> > see a python tool named nvgpu, an OpenXLA tool, various nvgpu things related
> > to Tegra, an nvgpu dialect for MLIR, etc. There are over 5,000 hits on google for
> > "nvgpu", only a few of which reference development of this module. Is there a
> > more unique name we can use? Thanks,
>
> Sorry, had missed this comment. Are you suggesting changing the module name
> or just reduce the number of times we use the nvgpu keyword in all the functions
> of the module? I don't see any in-tree or vfio-pci module with a similar *nvgpu*
> name, and the clash appears to be with items outside of the kernel tree. Given
> that, should we still change the module name as nvgpu-vfio-pci sounds a relevant
> name here? Thanks.

I'm referring to the module name, which in turn would be reflected in
various function names. The fact that there's no in-tree *nvgpu* driver
seems irrelevant when a web search for the term shows a variety of tools
and drivers. I believe there's even an out-of-tree, NVIDIA-sponsored
nvgpu driver for Android, correct? How does this relate to that? I
don't think it does, so why generate confusion?

I don't know your future plans for this driver, but it's currently
limited to exposing essentially a single feature on a very, very small
product subset, while "nvgpu" seems to project something much more
generic. If we're going to see more devices exposing coherent memory
with CXL, does that mean this driver might be short-lived and perhaps
won't see further expansion in functionality? If so, maybe it should be
named more specifically for the product it supports. I see some NVIDIA
pages referring to the GH200 superchip, so maybe "GH", e.g. "nvgh" or
"nvgh-gpu"?

Reading through the datasheet, I'm also reminded of issues we had with
the POWER implementation relative to isolation, since this coherent
memory is enabled via NVLink-C2C, which is opaque to Linux. The
datasheet claims "[f]ourth-generation NVLink allows accessing peer
memory with direct loads, stores, and atomic operations...". Are those
direct accesses reflected in the PCI topology, i.e. the isolation
exposed through PCIe ACS, or is the peer here limited to the CPU?

Thanks,
Alex