Re: Mass Storage Gadget Device Falls from SuperSpeed to High Speed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Rob Weber <rob@xxxxxxxxxxx> writes:

<snip>

>> > dmesg output is short (4 lines) and is probably helpful to find the
>> > meaningful timestamps of the dwc3 traces. The sequence of events on the
>> > device seems to start around 1346 seconds and ends around 1436 seconds.
>> 
>> right, from device perspective I see things behaving normally, then a
>> "random" reset shows up.
>
> In what way does this show up? Does the host send a reset command, does
> the device request the host port to reset, or does the device appear to
> reset itself?

this IRQ is generated based on Reset Signaling driven on the bus by the
host.

>> > Please note the "dwc3.1.auto: failed to enable ep0out" errors that are
>> > occurring when we connect our device to a host. These errors occur
>> > pretty regularly and we are not sure what is causing them.
>> 
>> this shouldn't happen, actually; where do you see those?
>
> They are in the dmesg output on the device. I was assuming they line up

huh? Here's what I have on device side dmesg:

[ 1345.662238] dwc3 dwc3.1.auto: request ffff880177cc4e40 was not queued to ep1out
[ 1346.581055] g_mass_storage gadget: super-speed config #1: Linux File-Backed Storage
[ 1435.982257] dwc3 dwc3.1.auto: request ffff880177cc4d80 was not queued to ep1out
[ 1436.264087] g_mass_storage gadget: high-speed config #1: Linux File-Backed Storage

That's the entire thing.

> with tracepoints in the dwc traces, but I wasn't sure what to look for.
> This error occurs approximately 75% of the time while the device is
> enumerating.

odd. Really shouldn't happen.

>> Anyway, here's the important part from dwc3 traces:
>> 
>> >  irq/23-dwc3-995   [000] d..1  1422.081846: dwc3_complete_trb: ep1in: 1/2 trb ffff8800364327d0 buf 00000000748ff000 size 0 ctrl 00000c18 (hlcS:SC:normal)
>> >  irq/23-dwc3-995   [000] d..1  1422.081853: dwc3_gadget_giveback: ep1in: req ffff880177cc4e40 length 16384/16384 zsI ==> 0
>> 
>> completed 16kiB transfer block
>> 
>> >  irq/23-dwc3-995   [000] d..1  1422.081890: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [196615] params 00000000 00000000 00000000 --> status: Successful
>> > file-storage-994   [001] d..1  1422.081917: dwc3_ep_queue: ep1in: req ffff880177cc4e40 length 0/13 zsI ==> -115
>> > file-storage-994   [001] d..1  1422.081928: dwc3_gadget: ep1in: req ffff880177cc4e40 dma 74905000 length 13
>> > file-storage-994   [001] d..1  1422.081938: dwc3_prepare_trb: ep1in: 2/2 trb ffff8800364327f0 buf 0000000074905000 size 13 ctrl 00000c19 (HlcS:SC:normal)
>> > file-storage-994   [001] d..1  1422.081972: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [196615] params 00000000 00000000 00000000 --> status: Successful
>> >  irq/23-dwc3-995   [000] d..1  1422.081980: dwc3_event: event (00100301): Link Change [U0]
>> >  irq/23-dwc3-995   [000] d..1  1422.081988: dwc3_event: event (00110301): Link Change [U1]
>> >  irq/23-dwc3-995   [000] d..1  1422.081997: dwc3_event: event (00004086): ep1in: Transfer In-Progress
>> >  irq/23-dwc3-995   [000] d..1  1422.082000: dwc3_complete_trb: ep1in: 1/2 trb ffff8800364327e0 buf 0000000074903000 size 0 ctrl 00000c18 (hlcS:SC:normal)
>> >  irq/23-dwc3-995   [000] d..1  1422.082008: dwc3_gadget_giveback: ep1in: req ffff880177cc4b40 length 8192/8192 zsI ==> 0
>> 
>> 8kiB block
>> 
>> >  irq/23-dwc3-995   [000] d..1  1422.082048: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [196615] params 00000000 00000000 00000000 --> status: Successful
>> > file-storage-994   [001] d..1  1422.082089: dwc3_ep_queue: ep1out: req ffff880177cc4d80 length 0/1024 zsI ==> -115
>> > file-storage-994   [001] d..1  1422.082101: dwc3_gadget: ep1out: req ffff880177cc4d80 dma 74905800 length 1024
>> > file-storage-994   [001] d..1  1422.082111: dwc3_prepare_trb: ep1out: 4/2 trb ffff880036423780 buf 0000000074905800 size 1024 ctrl 00000c19 (HlcS:SC:normal)
>> > file-storage-994   [001] d..1  1422.082144: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
>> >  irq/23-dwc3-995   [000] d..1  1422.082147: dwc3_event: event (00100301): Link Change [U0]
>> >  irq/23-dwc3-995   [000] d..1  1422.082155: dwc3_event: event (00110301): Link Change [U1]
>> >  irq/23-dwc3-995   [000] d..1  1422.082160: dwc3_event: event (00100301): Link Change [U0]
>> >  irq/23-dwc3-995   [000] d..1  1422.082167: dwc3_event: event (00110301): Link Change [U1]
>> >  irq/23-dwc3-995   [000] d..1  1422.082173: dwc3_event: event (00004086): ep1in: Transfer In-Progress
>> >  irq/23-dwc3-995   [000] d..1  1422.082176: dwc3_complete_trb: ep1in: 0/2 trb ffff8800364327f0 buf 0000000074905000 size 0 ctrl 00000c18 (hlcS:SC:normal)
>> >  irq/23-dwc3-995   [000] d..1  1422.082181: dwc3_gadget_giveback: ep1in: req ffff880177cc4e40 length 13/13 zsI ==> 0
>> 
>> CSW
>
> What is CSW?

Command Status Wrapper. Every mass storage transaction starts with a CBW
(Command block transfer) of 31 bytes and ends with a CSW (command status
transfer) of 13 bytes. Between them we may or may not have data transfers.

>> >  irq/23-dwc3-995   [000] d..1  1422.092840: dwc3_event: event (00120301): Link Change [U2]
>> >  irq/23-dwc3-995   [000] d..1  1427.800204: dwc3_event: event (00160301): Link Change [SS.Inactive]
>> >  irq/23-dwc3-995   [000] d..1  1427.813205: dwc3_event: event (00150301): Link Change [RX.Detect]
>> >  irq/23-dwc3-995   [000] d..1  1427.911837: dwc3_event: event (00040301): Link Change [SS.Disabled]
>> >  irq/23-dwc3-995   [000] d..1  1427.911849: dwc3_event: event (00000301): Link Change [U0]
>> >  irq/23-dwc3-995   [000] d..1  1427.914818: dwc3_event: event (00050301): Link Change [RX.Detect]
>> >  irq/23-dwc3-995   [000] d..1  1427.917746: dwc3_event: event (00030301): Link Change [U3]
>> 
>> link went to U3. Why would xhci put the link to U3 now? We're still
>> working fine.
>
> This looks like a pretty crazy set of transitions. I'm unfamiliar with
> LPM. Is this sequence of events normal for a transition to U3?

See figure 7-14 on USB 3.1 spec. This shows that from U2 we had LFPS
timeout which forced link to SS.Inactive. Then, link *must* go through
RX.Detect to restart a session. From there, a condition called RX Detect
Overlimit happens, which forces the port to SS.Disabled.

Link is, then, moved to U0 by the host. Then RX.Detect, then U3. This is
all under host side control. DWC3 is simply reporting the states.

> During this particular run, I was not performing any operations on the
> mass storage device besides mounting it on my computer. No reading or
> writing besides that. Would this sort of inactivity have something to do
> with the transition to U3?

No. XHCI shouldn't put anything into U3 by default. Mathias can give
more details here.

>> >  irq/23-dwc3-995   [000] d..1  1435.982020: dwc3_event: event (00000101): Reset [U0]
>> 
>> 8 seconds later, XHCI drivers a bus Reset
>> 
>> >  irq/23-dwc3-995   [000] d..1  1435.982116: dwc3_gadget_ep_cmd: ep1out: cmd 'End Transfer' [133384] params 00000000 00000000 00000000 --> status: Successful
>> > file-storage-994   [001] ....  1435.982186: dwc3_ep_dequeue: ep1out: req ffff880177cc4d80 length 0/1024 zsI ==> -115
>> >  irq/23-dwc3-995   [000] d..1  1435.982230: dwc3_gadget_giveback: ep1out: req ffff880177cc4d80 length 0/1024 zsI ==> -108
>> >  irq/23-dwc3-995   [000] d..1  1435.990478: dwc3_gadget_ep_cmd: ep1in: cmd 'End Transfer' [198920] params 00000000 00000000 00000000 --> status: Successful
>> > file-storage-994   [001] ....  1435.990596: dwc3_free_request: ep1in: req ffff880177cc4b40 length 8192/8192 zsI ==> 0
>> > file-storage-994   [001] ....  1435.990607: dwc3_free_request: ep1out: req ffff880177cc4d80 length 0/1024 zsI ==> -108
>> > file-storage-994   [001] ....  1435.990612: dwc3_free_request: ep1in: req ffff880177cc4e40 length 13/13 zsI ==> 0
>> > file-storage-994   [001] ....  1435.990616: dwc3_free_request: ep1out: req ffff880177cc4f00 length 31/1024 zsI ==> 0
>> > file-storage-994   [001] d..1  1435.990622: dwc3_gadget: Disabling ep1in
>> > file-storage-994   [001] d..1  1435.990641: dwc3_gadget: Disabling ep1out
>> >  irq/23-dwc3-995   [000] d..1  1435.990710: dwc3_event: event (080201c4): ep1out: Endpoint Command Complete
>> >  irq/23-dwc3-995   [000] d..1  1435.990718: dwc3_event: event (080301c6): ep1in: Endpoint Command Complete
>> 
>> All endpoints are disabled and transfers are cancelled.
>> 
>> >  irq/23-dwc3-995   [000] d..1  1436.038946: dwc3_event: event (00000201): Connection Done [U0]
>> >  irq/23-dwc3-995   [000] d..1  1436.038987: dwc3_gadget: Enabling ep0out
>> >  irq/23-dwc3-995   [000] d..1  1436.039037: dwc3_gadget_ep_cmd: ep0out: cmd 'Set Endpoint Configuration' [1] params 80000200 00000500 00000000 --> status: Successful
>> 
>> new enumeration in HS. The first parameter here contains the packet size
>> in the first 16 bits:
>> 
>> 0x0200 >> 3 = 64
>> 
>> >  irq/23-dwc3-995   [000] d..1  1436.039053: dwc3_gadget: Enabling ep0in
>> >  irq/23-dwc3-995   [000] d..1  1436.039095: dwc3_gadget_ep_cmd: ep0in: cmd 'Set Endpoint Configuration' [1] params 80000200 02000500 00000000 --> status: Successful
>> >  irq/23-dwc3-995   [000] d..1  1436.039114: dwc3_event: event (00000301): Link Change [U0]
>> >  irq/23-dwc3-995   [000] d..1  1436.109708: dwc3_event: event (0000c040): ep0out: Transfer Complete
>> >  irq/23-dwc3-995   [000] d..1  1436.109722: dwc3_ep0: ep0out: Transfer Complete: state 'Setup Phase'
>> >  irq/23-dwc3-995   [000] d..1  1436.109728: dwc3_ep0: Setup Phase
>> 
>> Transfers continue in HW after this.
>> 
>> > Lastly, I've been able to recreate this event while monitoring the
>> > activity with a USB protocol analyzer. I've attached a screenshot of the
>> > USB transactions that happen around the time of the drop from SuperSpeed
>> > to high speed. I can also provide an archive containing the raw output
>> > that can be viewed in the "USB Protocol Suite" software provided by
>> > Teledyne LaCroy if you're interested. The raw data is too large to
>> > include on this email.
>> 
>> Sure, that would be great. How can I download it? Any link which I can
>> curl? You can send it to me only, if you want; no problem.
>
> You (and anyone reading) can download it here:
> https://drive.google.com/file/d/1WSJ-222bguXTsRZ-mI5u2M7IP9FsZdj7/view?usp=sharing

I'll download it and have a look, thanks

>> > I would appreciate any help or guidance anyone could provide in
>> > understanding more about this issue. We really want to understand why
>> > we're transitioning from superspeed to high speed and whether or not
>> > this issue is caused by software or hardware. I understand it might be
>> > difficult for someone to help with a custom design and an old kernel,
>> > but any pointers for further debugging would be incredibly helpful.
>> > Thank you in advance for taking a look at all this data!
>> 
>> Sure thing, I'll ask Mathias (XHCI maintainer) for his help on this too
>> since it's unclear what's causing the switch to HS when exitting U3.
>
> Thanks in advance for taking a look, Mathias! If there is any more
> information you need from me, please let me know.

XHCI tracepoints (together with dynamic printk of xhci) would be
great. Another thing to try is disabling LPM. It could be that your
redriver is one of those in the set of "can't handle LPM" redrivers.

That's an easy thing to try. I suppose below is possibly enough:

modified   drivers/usb/host/xhci-pci.c
@@ -154,7 +154,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 		xhci->quirks |= XHCI_U2_DISABLE_WAKE;
 
 	if (pdev->vendor == PCI_VENDOR_ID_INTEL) {
-		xhci->quirks |= XHCI_LPM_SUPPORT;
+		/* xhci->quirks |= XHCI_LPM_SUPPORT; */
 		xhci->quirks |= XHCI_INTEL_HOST;
 		xhci->quirks |= XHCI_AVOID_BEI;
 	}

-- 
balbi

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux