On Wed, 2024-01-31 at 05:18 -0800, Christoph Hellwig wrote:
> On Wed, Jan 31, 2024 at 06:34:00AM +0100, Arthur Muller wrote:
> > Dear all,
> >
> > We've encountered a similar issue. In our case, we are using the
> > Lustre file system instead of NVMe-oF to connect our storage over
> > the network. Our setup involves an AMD EPYC 7282 machine paired
> > with Mellanox MT28908 cards. Following the guidelines in the Nvidia
> > documentation:
> >
> > https://docs.nvidia.com/networking/display/mlnxenv584150lts/installing+mlnx_en#src-2477565014_InstallingMLNX_EN-InstallationModes
> >
> > we compiled the MLNX_EN 5.8 LTS driver using VMA. Additionally, we
> > experimented with the latest MLNX_EN 23.10 driver, encountering the
> > same issue.
>
> If you use the nvidia out of tree junk you are completely on your own
> and have no one to blame but yourself. Any problems with that do not
> belong on a Linux mailing list.
>

Dear Christoph,

Thank you very much for your reply pointing me to the possible cause of
the problem.

I am not blaming anybody. Given the history of this thread, I assumed
that there might be an unresolved IOMMU issue and provided some context
in the hope of helping to debug it. There are some IOMMU-related threads
on the kernel Bugzilla. The setup mentioned was the most similar to ours.

Kind regards,
Arthur Müller