Hi Mathias, On Thu, Feb 29, 2024, Mathias Nyman wrote: > From: Niklas Neronin <niklas.neronin@xxxxxxxxx> > > Variables real & fake port do not convey their purpose, thus they are > replaced with a pointer to the root hub port 'struct xhci_port *rhub_port'. > 'rhub_port' contains real & fake ports in zero-based format, which happens > to be more widely used inside the xHCI driver: > - 'real_port' is ('rhub_port->hw_portnum' + 1) > - 'fake_port' is ('rhub_port->hcd_portnum' + 1) > > One reason for real port being one-based, is to signal other functions in > case struct 'xhci_virt_device' initialization failed, in this case the > value will remain 0. This is no longer needed, instead we check whether > or not 'rhub_port' is 'NULL'. > > Signed-off-by: Niklas Neronin <niklas.neronin@xxxxxxxxx> > Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> > --- > drivers/usb/host/xhci-hub.c | 4 +++- > drivers/usb/host/xhci-mem.c | 35 ++++++++++++++------------------- > drivers/usb/host/xhci-mtk-sch.c | 14 +++++-------- > drivers/usb/host/xhci-trace.h | 12 +++++------ > drivers/usb/host/xhci.c | 12 +++++------ > drivers/usb/host/xhci.h | 3 +-- > 6 files changed, 35 insertions(+), 45 deletions(-) > > diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c > index 0980ade2a234..37128a777a58 100644 We're getting a NULL pointer dereference bug for this patch. To reproduce this, just unload and reload the xhci driver while a device is connected. It may take a few times to hit the issue. [ 1515.737623] BUG: kernel NULL pointer dereference, address: 000000000000000c [ 1515.737629] #PF: supervisor read access in kernel mode [ 1515.737631] #PF: error_code(0x0000) - not-present page [ 1515.737633] PGD 0 P4D 0 [ 1515.737636] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 1515.737639] CPU: 1 PID: 2905 Comm: kworker/1:0 Not tainted 6.9.0-rc2-snps-detached-HEAD-g5bab5dc780c9 #8 [ 1515.737642] Hardware name: System manufacturer System Product Name/PRIME Q370M-C, BIOS 2605 10/16/2019 [ 1515.737643] Workqueue: usb_hub_wq hub_event [ 1515.737648] RIP: 0010:trace_event_raw_event_xhci_log_free_virt_dev+0x64/0xb7 [xhci_hcd] [ 1515.737667] Code: 85 c0 74 59 48 89 58 08 48 8b 53 18 48 89 e7 48 8b 52 10 48 89 50 18 48 8b 53 10 48 8b 52 10 48 89 50 10 48 8b 93 90 11 00 00 <8b> 52 0c 89 50 20 48 8b 93 90 11 00 00 8b 52 08 89 50 24 8b 93 b0 [ 1515.737669] RSP: 0018:ffffac30c0bdbce8 EFLAGS: 00010086 [ 1515.737671] RAX: ffff99e0003309b8 RBX: ffff99e040fb4000 RCX: ffff99e0003309b4 [ 1515.737673] RDX: 0000000000000000 RSI: 0000000000000616 RDI: ffffac30c0bdbce8 [ 1515.737729] RBP: ffff99e007b58a20 R08: 0000000000000002 R09: 0000000000000010 [ 1515.737731] R10: 000000014d33cf30 R11: 0000000000000020 R12: 0000000000000001 [ 1515.737733] R13: ffff99e008c9a294 R14: ffff99e008c9a258 R15: ffff99e004ca5000 [ 1515.737734] FS: 0000000000000000(0000) GS:ffff99e35bc40000(0000) knlGS:0000000000000000 [ 1515.737736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1515.737738] CR2: 000000000000000c CR3: 00000001024d6004 CR4: 00000000003706f0 [ 1515.737740] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1515.737741] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1515.737743] Call Trace: [ 1515.737746] <TASK> [ 1515.737748] ? __die_body+0x1a/0x5c [ 1515.737751] ? page_fault_oops+0x2ea/0x380 [ 1515.737806] ? sched_clock+0x10/0x23 [ 1515.737808] ? trace_clock_local+0x10/0x23 [ 1515.737812] ? exc_page_fault+0xfe/0x11e [ 1515.737815] ? asm_exc_page_fault+0x26/0x30 [ 1515.737818] ? trace_event_raw_event_xhci_log_free_virt_dev+0x64/0xb7 [xhci_hcd] [ 1515.737832] ? trace_event_raw_event_xhci_log_free_virt_dev+0x39/0xb7 [xhci_hcd] [ 1515.737844] xhci_free_virt_device+0x76/0x1d8 [xhci_hcd] [ 1515.737856] xhci_free_dev+0x111/0x12f [xhci_hcd] [ 1515.737867] hub_event+0xca6/0x137e [ 1515.737870] ? __schedule+0x656/0x69f [ 1515.737873] process_scheduled_works+0x1a4/0x2e3 [ 1515.737876] worker_thread+0x191/0x1e9 [ 1515.737878] ? __pfx_worker_thread+0x10/0x10 [ 1515.737881] kthread+0xe6/0xf1 [ 1515.737883] ? __pfx_kthread+0x10/0x10 [ 1515.737885] ret_from_fork+0x20/0x37 [ 1515.737887] ? __pfx_kthread+0x10/0x10 [ 1515.737889] ret_from_fork_asm+0x1a/0x30 [ 1515.737892] </TASK> To capture the tracepoints and avoid the NULL pointer, I made this change: diff --git a/drivers/usb/host/xhci-trace.h b/drivers/usb/host/xhci-trace.h index 1740000d54c2..b4b20e3f7539 100644 --- a/drivers/usb/host/xhci-trace.h +++ b/drivers/usb/host/xhci-trace.h @@ -172,8 +172,7 @@ DECLARE_EVENT_CLASS(xhci_log_free_virt_dev, __field(void *, vdev) __field(unsigned long long, out_ctx) __field(unsigned long long, in_ctx) - __field(int, hcd_portnum) - __field(int, hw_portnum) + __field(struct xhci_port *, rhub_port) __field(u16, current_mel) ), @@ -181,13 +180,14 @@ DECLARE_EVENT_CLASS(xhci_log_free_virt_dev, __entry->vdev = vdev; __entry->in_ctx = (unsigned long long) vdev->in_ctx->dma; __entry->out_ctx = (unsigned long long) vdev->out_ctx->dma; - __entry->hcd_portnum = (int) vdev->rhub_port->hcd_portnum; - __entry->hw_portnum = (int) vdev->rhub_port->hw_portnum; + __entry->rhub_port = (struct xhci_port *) vdev->rhub_port; __entry->current_mel = (u16) vdev->current_mel; ), TP_printk("vdev %p ctx %llx | %llx hcd_portnum %d hw_portnum %d current_mel %d", __entry->vdev, __entry->in_ctx, __entry->out_ctx, - __entry->hcd_portnum, __entry->hw_portnum, __entry->current_mel + __entry->rhub_port ? __entry->rhub_port->hcd_portnum : -1, + __entry->rhub_port ? __entry->rhub_port->hw_portnum : -1, + __entry->current_mel ) ); You should see this line where the NULL deref happened in the attached log: kworker/0:1-8 [000] d..1. 250.442611: xhci_free_virt_device: vdev 00000000a84fbabe ctx 159dd6000 | 1a7d60000 hcd_portnum -1 hw_portnum -1 current_mel 0 Please review and help fix it. Thanks, Thinh
Attachment:
xhci_null_ptr.tar.gz
Description: xhci_null_ptr.tar.gz