On Mon, Mar 27, 2023 at 11:30:20PM -0700, YoungJun.Park wrote: > I got a panic dump which happend on kernel( > I check the both of CentOS and mainline kernel > And I assume it could be happend on mainline kernel. > (like I said assume) Note, CentOS uses a very old and obsolete kernel, nothing we can do about that mess, please get support for that from the company that provided it. > I think xhci-debugfs file creation check the creation of > NULL parent dentry. Is it right? (drivers/usb/host/xhci-debugfs.c) Why is that needed? > The contents below is the what I analyze. > > 1. A lot of usb error > ... > [1994159.974407] usb 1-2: device descriptor read/64, error -71 > [1994160.187902] usb 1-2: new low-speed USB device number 36 using xhci_hcd > [1994160.302634] usb 1-2: device descriptor read/64, error -71 > [1994160.562027] usb 1-2: device descriptor read/64, error -71 > [1994160.663789] usb usb1-port2: unable to enumerate USB device > [1994162.256029] usb 1-2: new low-speed USB device number 37 using xhci_hcd > [1994162.370764] usb 1-2: device descriptor read/64, error -71 > [1994162.585258] usb 1-2: device descriptor read/64, error -71 > [1994162.797751] usb 1-2: new low-speed USB device number 38 using xhci_hcd > [1994162.912484] usb 1-2: device descriptor read/64, error -71 > [1994163.141944] usb 1-2: device descriptor read/64, error -71 > [1994163.243712] usb usb1-port2: attempt power cycle. > ... That has nothing to do with debugfs. > 2. After that panic happend and I got a dump. > The cause of panic reading "/sys/kernel/debug/trbs" which is abnormal. > (which have bad xhci_ring private data) > the file must be on the "/sys/kernel/debug/usb/~~~". > > sh> bt > PID: 91416 TASK: ffff8d68fcc54200 CPU: 1 COMMAND: "fbmons" > #0 [ffff8d679a6cbad0] machine_kexec at ffffffff9e2662c4 > #1 [ffff8d679a6cbb30] __crash_kexec at ffffffff9e322a32 > #2 [ffff8d679a6cbc00] crash_kexec at ffffffff9e322b20 > #3 [ffff8d679a6cbc18] oops_end at ffffffff9e98d798 > #4 [ffff8d679a6cbc40] no_context at ffffffff9e275d14 > #5 [ffff8d679a6cbc90] __bad_area_nosemaphore at ffffffff9e275fe2 > #6 [ffff8d679a6cbce0] bad_area_nosemaphore at ffffffff9e276104 > #7 [ffff8d679a6cbcf0] __do_page_fault at ffffffff9e990750 > #8 [ffff8d679a6cbd60] do_page_fault at ffffffff9e990975 > #9 [ffff8d679a6cbd90] page_fault at ffffffff9e98c778 > [exception RIP: xhci_ring_trb_show+29] > RIP: ffffffff9e76005d RSP: ffff8d679a6cbe40 RFLAGS: 00010246 > RAX: ffff8d497b0fe018 RBX: 0000000000000000 RCX: 0000001309207f9b > RDX: fffffffffffffff4 RSI: 0000000000000001 RDI: ffff8d6939b8fd40 > RBP: ffff8d679a6cbe60 R8: ffff8d49ffa5f1a0 R9: ffff8d3b3fc07300 > R10: ffff8d3b3fc07300 R11: ffffffff9e3de9fd R12: 0000000000000000 > R13: 0000000000000000 R14: ffff8d6939b8fd40 R15: ffff8d6939b8fd40 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > #10 [ffff8d679a6cbe68] seq_read at ffffffff9e476d10 > #11 [ffff8d679a6cbed8] vfs_read at ffffffff9e44e3ff > #12 [ffff8d679a6cbf08] sys_read at ffffffff9e44f27f > #13 [ffff8d679a6cbf50] system_call_fastpath at ffffffff9e995f92 > RIP: 00007f5f9749f6fd RSP: 00007f5f84c73700 RFLAGS: 00000293 > RAX: 0000000000000000 RBX: 00007f5f7066a390 RCX: ffffffffffffffff > RDX: 0000000000000400 RSI: 00007f5f706d2270 RDI: 0000000000000016 > RBP: 00007f5f706d2270 R8: 00007f5f953e73a0 R9: 00007f5f953e7388 > R10: 0000000000000040 R11: 0000000000000293 R12: 0000000000000400 > R13: 00007f5f84c7354c R14: 0000000022100004 R15: 00007f5f866c5368 > ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b So the xhci driver is crashing here? > 3. I found a another abnormal xhci debugfs file ep_context in dump. > Check xhci_slot_priv is alive and find the root is NULL. > > crash> files -d 0xffff8d68fe8f0000 > DENTRY INODE SUPERBLK TYPE PATH > ffff8d68fe8f0000 ffff8d489e979e10 ffff8d3b19169800 REG /sys/kernel/debug/ep-context > > crash > struct xchi_slot_priv 0xffff8d4812492c0 > struct xhci_slot_priv { > ... > root = 0x0, > ... > dev = 0xffff8d497b0fe000 > } > > 4. Looking into the mainline kernel code, I finally concluded that > /sys/fs/debug/"interface file" could be made. > > drivers/usb/host/xhci-debugfs.c > xhci_debugfs_create_slot function > ... > priv->root = debugfs_create_dir(priv->name, xhci->debugfs_slots); // root can be NULL How is root NULL here? What caused that to happen? Shouldn't you fix that issue here, that's not a general debugfs problem. thanks, greg k-h