KERNEL CRASH when using XHCI devices (affects any architecture, any USB device) This was already reported as a kernel bug in bugzilla (https://bugzilla.kernel.org/show_bug.cgi?id=204257) but I got told to report it here since it is usb related... Affected kernels: 5.2, 5.2.1, 5.2.2, 5.3-rc1, ... This bug is already causing real world problems with existing software and devices using SCSI BOT with raw SCSI commands and libusb software. Reproduce (tested on several different machines with 5.2,5.2.1,5.2.2,5.3rc1): - usb flash drive attached to XHCI controller (e.g. USB3.0 flash drive attached to USB3.0 port) - generic scsi module loaded (e.g. /dev/sg0 comes up when attaching the flash drive) - command line tool "sg_raw" from "sg3-utils" - execute: and press a key + return (-s1 sends one byte which is read from stdin) $ sudo sg_raw -s1 /dev/sg0 00 00 00 00 00 00 00 00 00 00 -> KERNEL Oops - same for -s2, -s3, ... up to -s8 (sending 1 to 8 bytes, exactly the maximum of bytes on my 64 bit machine where the "DMA bypass optimization / IDT" kicks in, see below) Since this can be triggered by any normal user (without any special USB device needed) I think it is important enough to fix it for the existing 5.2 kernel as well. --- Patch introducing the crash: https://patchwork.kernel.org/patch/10919167 / commit 33e39350ebd20fe6a77a51b8c21c3aa6b4a208cf - "usb: xhci: add Immediate Data Transfer support" Reason: NULL pointer dereference --- I took me quite some time to find the cause of this. I narrowed the crash down to the place inside of "xhci_queue_bulk_tx" in "xhci-ring.c" where the next SG is loaded ... while (sg && sent_len >= block_len) { /* New sg entry */ --num_sgs; sent_len -= block_len; if (num_sgs != 0) { sg = sg_next(sg); block_len = sg_dma_len(sg); <================= CRASH The comment of "sg_dma_len" clearly states "These macros should be used after a dma_map_sg call has been done..." - which was omitted by the new "xhci_map_urb_for_dma" function since the transfer was considered suitable for IDT. addr = (u64) sg_dma_address(sg); addr += sent_len; } } block_len -= sent_len; send_addr = addr; ... This only happens if the transfer was cosnideres suitable for IDT. When I patched the function "xhci_urb_suitable_for_idt" to always return false (nothing suitable for IDT) everything was working fine. Unfortunately I'm not deep enough into the inner workings of the kernel usb host driver to find a solution for this other than reverting the patch for IDT. Greetings, Maik Stohn kernel.log: Jul 19 16:43:29 lenovo kernel: BUG: kernel NULL pointer dereference, address: 0000000000000018 Jul 19 16:43:29 lenovo kernel: #PF: supervisor read access in kernel mode Jul 19 16:43:29 lenovo kernel: #PF: error_code(0x0000) - not-present page Jul 19 16:43:29 lenovo kernel: PGD 0 P4D 0 Jul 19 16:43:29 lenovo kernel: Oops: 0000 [#1] PREEMPT SMP PTI Jul 19 16:43:29 lenovo kernel: CPU: 3 PID: 3901 Comm: usb-storage Tainted: P IOE 5.2.1-arch1-1-ARCH #1 Jul 19 16:43:29 lenovo kernel: Hardware name: LENOVO 20287/AILZAZBZC, BIOS 8DCN26WW 09/23/2013 Jul 19 16:43:29 lenovo kernel: RIP: 0010:xhci_queue_bulk_tx+0x285/0x990 [xhci_hcd] Jul 19 16:43:29 lenovo kernel: Code: 77 38 48 85 ed 74 33 48 89 ef eb 0a 48 85 ff 74 26 44 39 e3 72 21 44 29 e3 41 83 ee 01 74 ed e8 71 a2 1e e0 4c 63 d3 48 89 c7 <44> 8b 60 18 4c 03 50 10 > Jul 19 16:43:29 lenovo kernel: RSP: 0018:ffffa7e6c1ce3b20 EFLAGS: 00010046 Jul 19 16:43:29 lenovo kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000410 Jul 19 16:43:29 lenovo kernel: RDX: ffff8b9112353ab0 RSI: 0000000000000010 RDI: 0000000000000000 Jul 19 16:43:29 lenovo kernel: RBP: ffff8b9113c5d570 R08: 0000000000000000 R09: 0000000000000000 Jul 19 16:43:29 lenovo kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jul 19 16:43:29 lenovo kernel: R13: 0000000000000410 R14: 00000000ffffffff R15: 0000000000000001 Jul 19 16:43:29 lenovo kernel: FS: 0000000000000000(0000) GS:ffff8b91572c0000(0000) knlGS:0000000000000000 Jul 19 16:43:29 lenovo kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jul 19 16:43:29 lenovo kernel: CR2: 0000000000000018 CR3: 0000000053a0a002 CR4: 00000000001606e0 Jul 19 16:43:29 lenovo kernel: Call Trace: Jul 19 16:43:29 lenovo kernel: ? __switch_to_asm+0x41/0x70 Jul 19 16:43:29 lenovo kernel: ? finish_task_switch+0x84/0x2d0 Jul 19 16:43:29 lenovo kernel: ? __switch_to+0x87/0x460 Jul 19 16:43:29 lenovo kernel: xhci_urb_enqueue+0x334/0x5b0 [xhci_hcd] Jul 19 16:43:29 lenovo kernel: usb_hcd_submit_urb+0xc7/0xbb0 Jul 19 16:43:29 lenovo kernel: ? __kmalloc+0x189/0x220 Jul 19 16:43:29 lenovo kernel: ? usb_alloc_urb+0x3e/0x60 Jul 19 16:43:29 lenovo kernel: usb_sg_wait+0x7b/0x150 Jul 19 16:43:29 lenovo kernel: usb_stor_bulk_transfer_sglist.part.0+0x71/0xd0 [usb_storage] Jul 19 16:43:29 lenovo kernel: usb_stor_bulk_srb+0x60/0x90 [usb_storage] Jul 19 16:43:29 lenovo kernel: usb_stor_Bulk_transport+0x179/0x3f0 [usb_storage] Jul 19 16:43:29 lenovo kernel: usb_stor_invoke_transport+0x63/0x520 [usb_storage] Jul 19 16:43:29 lenovo kernel: usb_stor_control_thread+0x233/0x300 [usb_storage] Jul 19 16:43:29 lenovo kernel: kthread+0xfd/0x130 Jul 19 16:43:29 lenovo kernel: ? fill_inquiry_response+0x40/0x40 [usb_storage] Jul 19 16:43:29 lenovo kernel: ? kthread_park+0x90/0x90 Jul 19 16:43:29 lenovo kernel: ret_from_fork+0x35/0x40 Jul 19 16:43:29 lenovo kernel: Modules linked in: hid_generic uas usbhid usb_storage hid rfcomm fuse xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mang>
Attachment:
smime.p7s
Description: S/MIME cryptographic signature