On 28/05/2014 07:26 AM, Saran Neti wrote:
Hi,
My VIA USB 3.0 controller has stopped working in recent kernels.
During boot, dmesg shows a WARNING stack trace at
drivers/usb/host/xhci-ring.c:1615 containing
handle_cmd_completion+0xdc7/0x1000.
USB ports become unusable: mouse, keyboard and fdisking mass storage
devices all spew error messages and stack traces followed by logs of
device resets.
Git bisection from 3.10 to 3.14 for drivers/usb/host points to
20e7acb13ff48, whose commit message appears to indicate that the VIA
controller in question may not be xhci spec rev1.0-compliant.
I'm not sure what Linux's policy on regressions for non-compliant hardware
is, but it would save me the trouble of patching and building a kernel each
time if you could revert or otherwise fix this.
Thanks
Hi Saran,
Thanks for reporting this. I reread the spec and realized that using the
slot id field of the event TRB for a Reset Device Command was a mistake.
I misunderstood the following description regarding the slot id field of
Command Completion Event TRB:
"The Slot ID field shall be updated by the xHC to reflect the slot
associated with the command that generated the event, with the following
exceptions:
- The Slot ID shall be cleared to ‘0’ for No Op, Set Latency Tolerance
  Value, Get Port Bandwidth, and Force Event Commands.
- The Slot ID shall be set to the ID of the newly allocated Device Slot
  for the Enable Slot Command.
- The value of Slot ID shall be vendor defined when generated by a
  vendor defined command.
This value is used as an index in the Device Context Base Address Array
to select the Device Context of the source device. If this Event is due
to a Host Controller Command, then this field shall be cleared to ‘0’."
For the Stop Endpoint Command, the Set TR Dequeue Pointer Command and the
Reset Endpoint Command, the xhci spec rev1.0 states explicitly that the
Command Completion Event placed on the Event Ring by the xHC shall have its
Slot ID field initialized to the value of the command’s Slot ID. However,
regarding the Reset Device Command it states that this field should be
cleared to zero. So it was my mistake, and your hardware is compatible with
the spec.
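To make the mismatch concrete, here is a minimal standalone sketch of how a Slot ID is read out of the fourth dword of a TRB and compared between the event and the command. The mask covers the same bits (31:24) as the kernel's TRB_TO_SLOT_ID macro; the struct demo_trb type, the demo_slot_ids_match() helper and any sample values are assumptions made for illustration, and the le32_to_cpu byte-order conversion is left out.

```c
#include <stdint.h>

/* Slot ID lives in bits 31:24 of the fourth TRB dword; this covers the
 * same bit range as the kernel's TRB_TO_SLOT_ID macro. */
#define TRB_TO_SLOT_ID(p) (((p) & (0xffu << 24)) >> 24)

/* Hypothetical stand-in for a generic TRB: four 32-bit fields. */
struct demo_trb {
    uint32_t field[4];
};

/* Returns 1 when the event TRB and the command TRB carry the same
 * Slot ID, 0 otherwise. This is the condition the WARN_ON checked. */
static int demo_slot_ids_match(const struct demo_trb *event,
                               const struct demo_trb *cmd)
{
    return TRB_TO_SLOT_ID(event->field[3]) ==
           TRB_TO_SLOT_ID(cmd->field[3]);
}
```

With a command TRB carrying slot 3 and a completion event whose Slot ID was cleared to zero, demo_slot_ids_match() returns 0, which corresponds to the WARN_ON firing on such a controller.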
Hence, either the patch can be reverted or I can send a patch to replace:
    case TRB_RESET_DEV:
        WARN_ON(slot_id != TRB_TO_SLOT_ID(
                le32_to_cpu(cmd_trb->generic.field[3])));
        xhci_handle_cmd_reset_dev(xhci, slot_id, event);
        break;
with:
    case TRB_RESET_DEV:
        slot_id = TRB_TO_SLOT_ID(le32_to_cpu(cmd_trb->generic.field[3]));
        xhci_handle_cmd_reset_dev(xhci, slot_id, event);
        break;
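As a rough userspace model of why taking the slot id from the command TRB helps: the completion handler looks the device up by slot id, and slot 0 is never a valid device slot, so an event that reports 0 cannot find the device. Everything named demo_* below is hypothetical and only loosely mirrors the driver; it is a sketch under those assumptions, not the actual xhci code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SLOTS 8
/* Slot ID is taken from bits 31:24 of the command TRB's fourth dword. */
#define DEMO_SLOT_ID(p) (((p) >> 24) & 0xffu)

/* Hypothetical per-device state. */
struct demo_virt_dev {
    int reset_completed;
};

/* Hypothetical controller state: one entry per slot, index 0 unused. */
struct demo_hcd {
    struct demo_virt_dev *devs[DEMO_MAX_SLOTS];
};

/* Models the reset-device completion handler: returns 0 on success,
 * -1 when the slot is out of range or has no device attached. */
static int demo_handle_reset_dev(struct demo_hcd *hcd, unsigned int slot_id)
{
    if (slot_id >= DEMO_MAX_SLOTS || hcd->devs[slot_id] == NULL)
        return -1;
    hcd->devs[slot_id]->reset_completed = 1;
    return 0;
}

/* Models the proposed replacement: ignore whatever the event reported
 * and derive the slot id from the command TRB instead. */
static int demo_complete_reset_dev(struct demo_hcd *hcd, uint32_t cmd_field3)
{
    return demo_handle_reset_dev(hcd, DEMO_SLOT_ID(cmd_field3));
}
```

In this model, calling demo_handle_reset_dev() with the event's slot id of 0 fails, while demo_complete_reset_dev() with the command's fourth dword finds the device in its slot.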
I am CCing Sarah Sharp and Mathias Nyman to decide which is the best way to
fix this regression.
regards,
Xenia
--- Bisection details ---
# git bisect log
git bisect start '--' 'drivers/usb/host/'
# good: [8bb495e3f02401ee6f76d1b1d77f3ac9f079e376] Linux 3.10
git bisect good 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376
# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
git bisect bad 455c6fdbd219161bd09b1165f11699d6d73de11c
# good: [40b3dc6da05c4ac0e317723a22eaa807c4b98648] usb: pci-quirks: amd_chipset_sb_type_init() can be static
git bisect good 40b3dc6da05c4ac0e317723a22eaa807c4b98648
# bad: [9b547a882e9ffec67bb41a4e66b4bcc0e91a2737] usb: r8a66597-hcd: Convert to clk_prepare/unprepare
git bisect bad 9b547a882e9ffec67bb41a4e66b4bcc0e91a2737
# good: [a393a807d0c805e7c723315ff0e88a857055e9c6] USB: EHCI: start new isochronous streams ASAP
git bisect good a393a807d0c805e7c723315ff0e88a857055e9c6
# bad: [a2cdc3432c361bb885476d1c625e22b518e0bc07] usb: xhci: remove the unused ->address field
git bisect bad a2cdc3432c361bb885476d1c625e22b518e0bc07
# bad: [20e7acb13ff48fbc884d5918c3697c27de63922a] xhci: use completion event's slot id rather than dig it out of command
git bisect bad 20e7acb13ff48fbc884d5918c3697c27de63922a
# good: [d194c031994d3fc1038fa09e9e92d9be24a21921] xhci: correct the usage of USB_CTRL_SET_TIMEOUT
git bisect good d194c031994d3fc1038fa09e9e92d9be24a21921
# good: [b244b431f89e152dd4bf35d71786f1c0eb8cba7e] xhci: refactor TRB_ENABLE_SLOT case into function
git bisect good b244b431f89e152dd4bf35d71786f1c0eb8cba7e
# good: [9b3103ac9d19525781c297c4fb1e544e077c8901] xhci: refactor TRB_ADDR_DEV case into function
git bisect good 9b3103ac9d19525781c297c4fb1e544e077c8901
# first bad commit: [20e7acb13ff48fbc884d5918c3697c27de63922a] xhci: use completion event's slot id rather than dig it out of command
# git show 20e7acb13ff48fbc884d5918c3697c27de63922a
commit 20e7acb13ff48fbc884d5918c3697c27de63922a
Author: Xenia Ragiadakou <burzalodowa@xxxxxxxxx>
Date: Mon Sep 9 13:29:50 2013 +0300
xhci: use completion event's slot id rather than dig it out of command
Since the slot id retrieved from the Reset Device TRB matches the slot id in
the command completion event, which is available, there is no need to determine
it again.
This patch removes the uneccessary reassignment to slot id and adds a WARN_ON
in case the two Slot ID fields differ (although according xhci spec rev1.0
they should not differ).
Signed-off-by: Xenia Ragiadakou <burzalodowa@xxxxxxxxx>
Signed-off-by: Sarah Sharp <sarah.a.sharp@xxxxxxxxxxxxxxx>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index e3b61b8..88939b7 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1547,9 +1547,9 @@ bandwidth_change:
xhci_handle_cmd_reset_ep(xhci, event,
xhci->cmd_ring->dequeue);
break;
case TRB_TYPE(TRB_RESET_DEV):
+ WARN_ON(slot_id != TRB_TO_SLOT_ID(
+ le32_to_cpu(xhci->cmd_ring->dequeue->generic.field[3])));
xhci_dbg(xhci, "Completed reset device command.\n");
- slot_id = TRB_TO_SLOT_ID(
- le32_to_cpu(xhci->cmd_ring->dequeue->generic.field[3]));
virt_dev = xhci->devs[slot_id];
if (virt_dev)
handle_cmd_in_cmd_wait_list(xhci, virt_dev, event);
--- Problem Details ---
# uname -a
Linux godel 3.15.0-rc7-Saran-00040-gcd79bde #15 SMP PREEMPT \
Tue May 27 23:18:08 EDT 2014 x86_64 GNU/Linux
# git rev-parse HEAD
cd79bde29f00a346eec3fe17c1c5073c37ed95e7
# lspci | grep VIA
02:00.0 USB controller: VIA Technologies, Inc. Device 3483 (rev 01)
# lsusb -v
(...Gets stuck trying to list the following: )
Bus 001 Device 002: ID 2109:3431
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.10
bDeviceClass 9 Hub
bDeviceSubClass 0 Unused
bDeviceProtocol 1 Single TT
bMaxPacketSize0 64
idVendor 0x2109
idProduct 0x3431
bcdDevice 4.20
iManufacturer 0
iProduct 1 USB2.0 Hub
iSerial 0
bNumConfigurations 1
...
Binary Object Store Descriptor:
bLength 5
bDescriptorType 15
wTotalLength 42
bNumDeviceCaps 3
USB 2.0 Extension Device Capability:
bLength 7
bDescriptorType 16
bDevCapabilityType 2
bmAttributes 0x00000002
Link Power Management (LPM) Supported
SuperSpeed USB Device Capability:
bLength 10
bDescriptorType 16
bDevCapabilityType 3
bmAttributes 0x00
wSpeedsSupported 0x000e
Device can operate at Full Speed (12Mbps)
Device can operate at High Speed (480Mbps)
Device can operate at SuperSpeed (5Gbps)
bFunctionalitySupport 1
Lowest fully-functional device speed is Full Speed (12Mbps)
bU1DevExitLat 4 micro seconds
bU2DevExitLat 231 micro seconds
Container ID Device Capability:
bLength 20
bDescriptorType 16
bDevCapabilityType 4
bReserved 0
ContainerID {5cf3ee30-d507-4925-b001-802d79434c30}
[ Stuck ]
# tailf /var/log/everything.log
...
xhci_hcd 0000:02:00.0: Reset device command completion for disabled slot 0
hub 1-1:1.0: hub_port_status failed (err = -110)
xhci_hcd 0000:02:00.0: Timeout while waiting for reset device command
usb 2-2: reset SuperSpeed USB device number 12 using xhci_hcd
xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8804f553b240
xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8804f553b288
xhci_hcd 0000:02:00.0: Trying to add endpoint 0x81 without dropping it.
usb 2-2: Busted HC? Not enough HCD resources for old configuration.
...
# dmesg
(relevant stuff)
WARNING: CPU: 6 PID: 0 at drivers/usb/host/xhci-ring.c:1615 \
handle_cmd_completion+0xdc7/0x1000 [xhci_hcd]()
Modules linked in: ata_generic usb_storage pata_acpi btrfs \
xor ohci_pci ohci_hcd ehci_pci xhci_hcd ehci_hcd \
pata_atiixp crc32c_intel usbcore usb_common floppy \
raid6_pq sd_mod crc_t10dif crct10dif_common ahci \
libahci libata scsi_mod
CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.15.0-rc7-Saran-00040-gcd79bde #15
Hardware name: Gigabyte Technology Co., Ltd. GA-78LMT-USB3/GA-78LMT-USB3\
, BIOS FA 04/23/2013
0000000000000009 ffff88052ed83d48 ffffffff814b65a9 0000000000000000
ffff88052ed83d80 ffffffff81065dad 0000000000000000 0000000000000003
ffff88051033f6e0 ffff88051033f080 ffff8805103c8000 ffff88052ed83d90
Call Trace:
<IRQ> [<ffffffff814b65a9>] dump_stack+0x4d/0x6f
[<ffffffff81065dad>] warn_slowpath_common+0x7d/0xa0
[<ffffffff81065e8a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa03d6077>] handle_cmd_completion+0xdc7/0x1000 [xhci_hcd]
[<ffffffff810a34ad>] ? enqueue_task_fair+0x10d/0x5b0
[<ffffffff8109b285>] ? sched_clock_cpu+0xb5/0xe0
[<ffffffffa03d6beb>] xhci_irq+0x5db/0x1ec0 [xhci_hcd]
[<ffffffff81095549>] ? ttwu_do_wakeup+0x19/0xf0
[<ffffffff81097e2f>] ? try_to_wake_up+0x1ff/0x2e0
[<ffffffffa03d84e1>] xhci_msi_irq+0x11/0x20 [xhci_hcd]
[<ffffffff810c126e>] handle_irq_event_percpu+0x3e/0x1f0
[<ffffffff810c145d>] handle_irq_event+0x3d/0x60
[<ffffffff810c3ff6>] handle_edge_irq+0x66/0x130
[<ffffffff81016a5e>] handle_irq+0x1e/0x40
[<ffffffff814c59cd>] do_IRQ+0x4d/0xe0
[<ffffffff814bbc2d>] common_interrupt+0x6d/0x6d
<EOI> [<ffffffff81050196>] ? native_safe_halt+0x6/0x10
[<ffffffff8101deef>] default_idle+0x1f/0x100
[<ffffffff8101e86f>] arch_cpu_idle+0xf/0x20
[<ffffffff810aba18>] cpu_startup_entry+0x258/0x490
[<ffffffff81043224>] start_secondary+0x1f4/0x280
---[ end trace d0b3dfbd98479c47 ]---
--
Saran