On 11.05.2018 15:11, Jordan Glover wrote:
On May 11, 2018 1:46 PM, Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
Hi
On 10.05.2018 14:49, Jordan Glover wrote:
Hello,
Detaching plugged external usb disk with: "udisksctl power-off --block-device <disk>" causes NULL pointer dereference and kernel hang. Tested with 4.17-rc4 on Manjaro Linux config and my own custom config with two different usb disks. It doesn't happen with 4.16.x. Below are logs registered with my own kernel config:
I'm able to reproduce this.
udisksd[1375]: Successfully sent SCSI command SYNCHRONIZE CACHE to /dev/sda
udisksd[1375]: Successfully sent SCSI command START STOP UNIT to /dev/sda
kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
kernel: sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
upowerd[1387]: unhandled action 'unbind' on /sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.0
laptop udisksd[1375]: Powered off /dev/sda - successfully wrote to sysfs path /sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/remove
kernel: usb 2-3: USB disconnect, device number 2
kernel: BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
kernel: RIP: 0010:xhci_hub_control+0x1ee5/0x1ff0 [xhci_hcd]
looks like xhci issue, triggered by speed = xhci->devs[i]->udev->speed in
xhci_find_slot_id_by_port()
xhci->devs[i]->udev seems to be NULL, probably because of commit 44a182b9d177
("xhci: Fix use-after-free in xhci_free_virt_device")
That patch itself fixes another regression, I'll see igf there is a better solution
Thanks
-Mathias
Uh, it's a relief. I was afraid being on my own with this. Looking forward for fix. Thank you.
Ok, I think I found the reason.
Below details are mostly for myself to remember what's going on.
Some testcode found last.
It's a combination of USB3 devices lacking a real "disabled" link state,
udisksctl power-off using the "remove" sysfs entry causing a logical disconnect,
and xhci free_dev being async, not really freeing before returning.
udisksctl power-off uses the "remove" sysfs entry, setting removed bit,calls
logical_disaconnect and sets USB3 device to U3, and kicks hub thread.
Hub thread calls usb_disconnect, and end by calling xhci_free_dev callback.
xhci_free_dev returns before freeing anything. It queues xhci slot disabling commands,
which when complete will free the slot (set xhci->devs[i] to NULL).
Before xhci->devs[i] is set to NULL the hub thread notices the port is enabled, (U3)
and will try to disable it once again, setting it to U3 again.
xhci->devs[i]->udev will be NULL in 4.17-rc4 here, but we only check xhci->devs[i]
before setting U3, and hence trigger the NULL pointer dereference.
xhci->devs[i]->udev is a better, earlier indicator to check if xhci slot is about to
go be disabled and freed, and simply checking it as well makes sense.
Some usb core change in how we handle the whole thing might make sense, but to fix this,
I think that just adding the below code should be enough for 4.17
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 72ebbc9..32cd52c 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -354,7 +354,7 @@ int xhci_find_slot_id_by_port(struct usb_hcd *hcd, struct xhci_hcd *xhci,
slot_id = 0;
for (i = 0; i < MAX_HC_SLOTS; i++) {
- if (!xhci->devs[i])
+ if (!xhci->devs[i] || !xhci->devs[i]->udev)
continue;
speed = xhci->devs[i]->udev->speed;
if (((speed >= USB_SPEED_SUPER) == (hcd->speed >= HCD_USB3))
-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html