What about a much newer kernel, like 6.0 or so?

Paolo

On Thu, Nov 24, 2022 at 7:18 AM Ashish Gupta (SJC) <ashish.gupta1@xxxxxxxxxxx> wrote:
>
> Hi Paolo,
>
> With v5.10.155 also, it failed in similar way.
>
> [root@ahvgpu04-1 ~]# uname -r
> 5.10.155-2.el7.nutanix.20220304.242.x86_64
>
> Logs from guest vm.
>
> [ 113.669214] NVRM: GPU at PCI:0000:00:06: GPU-fcdeaa4c-664a-4de8-2e32-23e14628ce8c
> [ 113.669215] NVRM: GPU Board Serial Number: 1651522000466
> [ 113.669216] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.669384] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.669400] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.669498] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.669609] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function GSP_RM_CONTROL (0x20800a70 0x0).
> [ 113.669615] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function GSP_RM_CONTROL (0x20800a6c 0x4).
> [ 113.670156] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function GSP_RM_CONTROL (0x6 0x0).
> [ 113.670247] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.670338] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function GSP_RM_CONTROL (0x20800a38 0x18).
> [ 113.672663] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function UPDATE_BAR_PDE (0x0 0x0).
> [ 113.672702] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.672709] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function FREE (0x0 0x0).
> [ 113.672787] NVRM: Xid (PCI:0000:00:06): 119, pid='<unknown>', name=<unknown>, Timeout waiting for RPC from GSP! Expected function UNLOADING_GUEST_DRIVER (0x0 0x0).
> [ 113.674376] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x11:0x45:2540)
> [ 113.675130] NVRM: GPU 0000:00:06.0: rm_init_adapter failed, device minor number 0
> [ 113.850458] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x22:0x56:731)
> [ 113.851206] NVRM: GPU 0000:00:06.0: rm_init_adapter failed, device minor number 0
>
> Regards,
> --Ashish Gupta
>
> From: Ashish Gupta (SJC) <ashish.gupta1@xxxxxxxxxxx>
> Date: Wednesday, November 23, 2022 at 5:49 PM
> To: Paolo Bonzini <pbonzini@xxxxxxxxxx>, kvm@xxxxxxxxxxxxxxx <kvm@xxxxxxxxxxxxxxx>
> Cc: seanjc@xxxxxxxxxx <seanjc@xxxxxxxxxx>, John Levon <john.levon@xxxxxxxxxxx>
> Subject: Re: Nvidia GPU PCI passthrough and kernel commit #5f33887a36824f1e906863460535be5d841a4364
>
> > Have you tested with a more recent version than 5.10.x, to see if the
> > bug is still there?
>
> Building image with v5.10.155.
> I am hoping to get result in 2-3H, I will update thread.
>
> Regards,
> --Ashish Gupta
>
> From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Date: Wednesday, November 23, 2022 at 5:39 PM
> To: Ashish Gupta (SJC) <ashish.gupta1@xxxxxxxxxxx>, kvm@xxxxxxxxxxxxxxx <kvm@xxxxxxxxxxxxxxx>
> Cc: seanjc@xxxxxxxxxx <seanjc@xxxxxxxxxx>, John Levon <john.levon@xxxxxxxxxxx>
> Subject: Re: Nvidia GPU PCI passthrough and kernel commit #5f33887a36824f1e906863460535be5d841a4364
>
> On 11/24/22 01:56, Ashish Gupta (SJC) wrote:
> > Nutanix uses KVM based hypervisor, which is called AHV (Acropolis
> > Hypervisor).
> >
> > latest AHV release is based on kernel v5.10.117. where we found that
> > Nvidia GPU cards (10/A30/A40 etc) stopped working.
> >
> > Guest VM (based on centos7 or Ubuntu 16.10) were able to identify card
> > but after installing Nvidia Grid driver we were seeing following logs in
> > guest vm.
> >
>
> Have you tested with a more recent version than 5.10.x, to see if the
> bug is still there?
>
> Paolo