On 2019-03-22 12:58 p.m., John Donnelly wrote:
> Hello,
>
> I am investigating an issue reported by a test group concerning this driver. Their test loads and unloads every kernel module included in the 4.14.35 kernel release. You don't even need an AMD platform. It occurs on any Intel, or a KVM VM instance too.
>
> Kernel panic while "modprobe amdkfd ; modprobe -r amdkfd"
>
> [ 329.425334] ? __slab_free+0x9b/0x2ba
> [ 329.427836] ? process_slab+0x3c1/0x45c
> [ 329.430336] dev_printk_emit+0x4e/0x65
> [ 329.432829] __dev_printk+0x46/0x8b
> [ 329.435183] _dev_info+0x6c/0x85
> [ 329.437435] ? kfree+0x141/0x182
> [ 329.439646] kfd_module_exit+0x37/0x39 [amdkfd]
> [ 329.442258] SyS_delete_module+0x1c3/0x26f
> [ 329.444722] ? entry_SYSCALL_64_after_hwframe+0xaa/0x0
> [ 329.447479] ? entry_SYSCALL_64_after_hwframe+0xa3/0x0
> [ 329.450206] ? entry_SYSCALL_64_after_hwframe+0x9c/0x0
> [ 329.452912] ? entry_SYSCALL_64_after_hwframe+0x95/0x0
> [ 329.455586] do_syscall_64+0x79/0x1ae
> [ 329.457766] entry_SYSCALL_64_after_hwframe+0x151/0x0
> [ 329.460369] RIP: 0033:0x7f1757a1b457
> [ 329.462502] RSP: 002b:00007ffd62ce1f48 EFLAGS: 00000206 ORIG_RAX:
>
>
>
> Sometimes the unload works but the message logged is garbage:
>
> [root@jpd-vmbase02 ~]# modprobe -r amdkfd
> [ 144.449981] ???????????? hn??蟟??xn??ן??kfd: Removed module

I think this was caused by using dev_info with a kfd_device that didn't exist any more. It was fixed by this commit:

commit c393e9b2d51540b74e18e555df14706098dbf2cc
Author: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Date:   Mon Nov 13 18:08:48 2017 +0200

    drm/amdkfd: fix amdkfd use-after-free GP fault

    Fix GP fault caused by dev_info() reference to a struct device*
    after the device has been freed (use after free).
    kfd_chardev_exit() frees the device so 'kfd_device' should not
    be used after calling kfd_chardev_exit().
    Signed-off-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
    Signed-off-by: Oded Gabbay <oded.gabbay@xxxxxxxxx>

>
>
> Is this something one of the team members could have possibly corrected in an upstream version?

In current kernels, amdkfd is no longer a separate KO. It's part of amdgpu now. Also see above. This bug is probably not reproducible any more.

Regards,
  Felix

>
> #define KFD_DRIVER_DESC "Standalone HSA driver for AMD's GPUs"
> #define KFD_DRIVER_DATE "20150421"
> #define KFD_DRIVER_MAJOR 0
> #define KFD_DRIVER_MINOR 7
> #define KFD_DRIVER_PATCHLEVEL 2
>
>
> Any advice welcome.
>
>
> Thank you,
>
> John
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
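
For readers who want to see the ordering problem from the commit message in isolation, here is a minimal userspace sketch in C. It is not kernel code and not the actual patch: struct fake_device, kfd_device_sketch, and every sketch_* function are hypothetical stand-ins for struct device, 'kfd_device', kfd_chardev_exit(), dev_info(), and kfd_module_exit(). It only assumes what the commit message states, namely that the device is freed by the chardev exit path and must not be referenced afterwards. The sketch avoids the problem by logging before the free; whether the real fix took that route or stopped using the device pointer for the message is not stated in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct fake_device {
    char name[16];
};

/* Plays the role of the global 'kfd_device' pointer. */
static struct fake_device *kfd_device_sketch;

/* Stand-in for module init: allocate and name the device. */
int sketch_init(void)
{
    kfd_device_sketch = malloc(sizeof(*kfd_device_sketch));
    if (!kfd_device_sketch)
        return -1;
    strcpy(kfd_device_sketch->name, "kfd");
    return 0;
}

/* Stand-in for kfd_chardev_exit(): frees the device.
 * Clearing the pointer turns a later misuse into a detectable
 * NULL instead of a dangling reference. */
void sketch_chardev_exit(void)
{
    free(kfd_device_sketch);
    kfd_device_sketch = NULL;
}

/* Stand-in for dev_info(): writes "<name>: <msg>" into buf.
 * Returns -1 if there is no device to read the name from. */
int sketch_dev_info(const struct fake_device *dev, const char *msg,
                    char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (!dev)
        return -1;
    snprintf(buf, len, "%s: %s", dev->name, msg);
    return 0;
}

/* Safe ordering for the exit path: log while the device is
 * still valid, then free it. The buggy ordering would call
 * sketch_chardev_exit() first and then read dev->name. */
int sketch_module_exit(char *buf, size_t len)
{
    int ret = sketch_dev_info(kfd_device_sketch, "Removed module", buf, len);

    sketch_chardev_exit();
    return ret;
}
```

Calling sketch_init() followed by sketch_module_exit() fills the buffer with the device name and message while the device is still alive. The garbage log line quoted earlier in the thread is what reading the name from already-freed memory can look like when it does not fault outright.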