Re: g_mass_storage bug ?

On Wed, Sep 24, 2014 at 01:53:31PM -0400, Alan Stern wrote:
> On Wed, 24 Sep 2014, Felipe Balbi wrote:
> 
> > > I'll capture usbmon and send here shortly.
> > 
> > here it is... Interesting part starts at line 73 (114 on this email)
> > where the data transport received EPIPE (due to Stall). This time
> > however, I was eventually able to talk to the device and managed to
> > issue quite a few writes to it.
> 
> Here's where the unexpected stuff begins:
> 
> > ed2541c0 1237463240 S Bo:003:01 -115 31 = 55534243 06000000 c0000000 8000061a 003f00c0 00000000 00000000 000000
> > ed2541c0 1237463431 C Bo:003:01 0 31 >
> > ec1a8540 1237463873 S Bi:003:01 -115 192 <
> > ec1a8540 1237464053 C Bi:003:01 -32 0
> > ed2541c0 1237464158 S Co:003:00 s 02 01 0000 0081 0000 0
> > ed2541c0 1237464359 C Co:003:00 0 0
> > ed2541c0 1237468607 S Bi:003:01 -115 13 <
> > ed2541c0 1237468802 C Bi:003:01 -75 0
> 
> This is the first MODE SENSE command.  The gadget should send as much
> data as it can before halting the bulk-IN endpoint.  Instead, the
> endpoint was halted first.
> 
> Then, after the host cleared the halt, the gadget apparently sent the
> data that _should_ have been sent previously.  The host was expecting
> to receive the CSW at this point, so there was an overflow error.
> That's what caused the host to perform a reset.
> 
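For anyone following along, the 31-byte CBW and 13-byte CSW in the trace can be decoded by hand. Below is a rough Python sketch of that decoding, following the Bulk-Only Transport layout. The helper names (`usbmon_hex_to_bytes`, `parse_cbw`, `parse_csw`) are made up for illustration and are not part of usbmon or any kernel tool.

```python
import struct

def usbmon_hex_to_bytes(words: str) -> bytes:
    """Turn usbmon's space-separated hex words into raw bytes."""
    return bytes.fromhex(words.replace(" ", ""))

def parse_cbw(data: bytes) -> dict:
    """Decode a 31-byte Command Block Wrapper (all fields little-endian)."""
    if len(data) != 31:
        raise ValueError("a CBW is exactly 31 bytes")
    sig, tag, xfer_len, flags, lun, cb_len = struct.unpack_from("<4sIIBBB", data)
    if sig != b"USBC":
        raise ValueError("bad CBW signature")
    return {
        "tag": tag,
        "data_length": xfer_len,
        "direction": "IN" if flags & 0x80 else "OUT",
        "lun": lun & 0x0F,
        # The command block starts at offset 15; only cb_len bytes are valid.
        "cdb": data[15:15 + (cb_len & 0x1F)],
    }

def parse_csw(data: bytes) -> dict:
    """Decode a 13-byte Command Status Wrapper."""
    if len(data) != 13:
        raise ValueError("a CSW is exactly 13 bytes")
    sig, tag, residue, status = struct.unpack("<4sIIB", data)
    if sig != b"USBS":
        raise ValueError("bad CSW signature")
    return {"tag": tag, "residue": residue, "status": status}

if __name__ == "__main__":
    cbw = parse_cbw(usbmon_hex_to_bytes(
        "55534243 06000000 c0000000 8000061a 003f00c0 "
        "00000000 00000000 000000"))
    print(cbw["tag"], cbw["data_length"], cbw["direction"], cbw["cdb"].hex())
```

Applied to the first CBW above, this gives tag 6, a 192-byte IN data phase and opcode 0x1a (MODE SENSE(6)), which matches the 192-byte bulk-IN URB the host submitted next.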
> Evidently this UDC implements the set_halt method incorrectly.  
> According to the kerneldoc for usb_ep_set_halt:
> 
>  * Attempts to halt IN endpoints will fail (returning -EAGAIN) if any
>  * transfer requests are still queued, or if the controller hardware
>  * (usually a FIFO) still holds bytes that the host hasn't collected.

Damn old bugs :-) I'll fix that up and Cc stable.
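
To spell out the rule in that kerneldoc, here is a toy Python model of the check an IN endpoint is expected to make before halting. It is only a sketch of the documented behaviour, not the UDC driver code. The class and field names are hypothetical, and 11 is the Linux value of EAGAIN.

```python
EAGAIN = 11  # Linux errno value, hard-coded so the sketch is portable


class InEndpoint:
    """Toy model of a bulk-IN endpoint on the gadget side (hypothetical)."""

    def __init__(self):
        self.queued = []      # transfer requests not yet completed
        self.fifo_bytes = 0   # bytes in the hardware FIFO the host has not read
        self.halted = False

    def set_halt(self) -> int:
        # Refuse to halt while data is still pending, so the host can
        # collect it before it sees the STALL.
        if self.queued or self.fifo_bytes:
            return -EAGAIN
        self.halted = True
        return 0


if __name__ == "__main__":
    ep = InEndpoint()
    ep.fifo_bytes = 8
    print(ep.set_halt())   # refused while the FIFO still holds data
    ep.fifo_bytes = 0
    print(ep.set_halt())   # accepted once everything has been collected
```

With a check like this the caller gets -EAGAIN and has to retry the halt later, instead of the stall reaching the host ahead of the data.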

> This didn't happen; the endpoint was halted before the host collected 
> the pending data.
> 
> Incidentally, even though the URB completed with -EOVERFLOW status, we
> still should see the first 13 bytes of data (i.e., the portion that
> could fit into the data buffer).  But the actual_length value is 0.  I
> don't know if this is a quirk of the xHCI hardware or a bug in
> xhci-hcd.
> 
> > ec1a8540 1237469361 S Co:001:00 s 23 03 0004 0001 0000 0
> > ec1a8540 1237471551 C Co:001:00 0 0
> > ed2209c0 1237534064 S Ci:001:00 s a3 00 0000 0001 0004 4 <
> > ed2209c0 1237537012 C Ci:001:00 0 4 = 03051000
> > ed2209c0 1237594113 S Co:001:00 s 23 01 0014 0001 0000 0
> > ed2209c0 1237595037 C Co:001:00 0 0
> > ed2209c0 1237595434 S Co:001:00 s 23 01 0001 0001 0000 0
> > ed2209c0 1237597480 C Co:001:00 0 0
> 
> Immediately after resetting the port, the host disabled it.  No
> indication of why.
> 
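The hub requests in that block are easier to follow once the setup bytes are named. A small Python sketch of that decoding is below. It assumes the USB 2.0 hub class feature selectors and covers only the values that show up in this trace; the function names are hypothetical.

```python
import struct

# Request codes used by the hub class requests in the trace.
REQUESTS = {0x00: "GET_STATUS", 0x01: "CLEAR_FEATURE", 0x03: "SET_FEATURE"}

# Hub port feature selectors (USB 2.0 hub class); only the ones seen here.
PORT_FEATURES = {
    0x0001: "PORT_ENABLE",
    0x0004: "PORT_RESET",
    0x0014: "C_PORT_RESET",
}


def decode_hub_setup(setup: str) -> str:
    """Name a hub port request from usbmon's five setup fields (hex)."""
    bm_request_type, b_request, w_value, w_index, _w_length = (
        int(field, 16) for field in setup.split())
    # 0x23 (host-to-device) and 0xa3 (device-to-host) are class requests
    # addressed to a port; mask off the direction bit to accept both.
    if bm_request_type & 0x7F != 0x23:
        raise ValueError("not a class request addressed to a hub port")
    name = REQUESTS.get(b_request, f"request 0x{b_request:02x}")
    port = w_index & 0xFF
    if name == "GET_STATUS":
        return f"GET_STATUS port {port}"
    feature = PORT_FEATURES.get(w_value, f"feature 0x{w_value:04x}")
    return f"{name}({feature}) port {port}"


def decode_port_status(hex_words: str):
    """Split the 4-byte GetPortStatus reply into (wPortStatus, wPortChange)."""
    return struct.unpack("<HH", bytes.fromhex(hex_words.replace(" ", "")))


if __name__ == "__main__":
    for setup in ("23 03 0004 0001 0000", "a3 00 0000 0001 0004",
                  "23 01 0014 0001 0000", "23 01 0001 0001 0000"):
        print(decode_hub_setup(setup))
```

Read this way, `23 01 0001 0001` is ClearPortFeature(PORT_ENABLE), which is the port disable noted above, and the wPortChange value 0x0010 in the status reply has only the reset-change bit set.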
> > ed2209c0 1237597823 S Co:001:00 s 23 03 0004 0001 0000 0
> > ed2209c0 1237597890 C Co:001:00 0 0
> > ed2209c0 1237654005 S Ci:001:00 s a3 00 0000 0001 0004 4 <
> > ed2209c0 1237654098 C Ci:001:00 0 4 = 03051000
> > ed2209c0 1237714084 S Co:001:00 s 23 01 0014 0001 0000 0
> > ed2209c0 1237714151 C Co:001:00 0 0
> > ed2209c0 1237715894 S Co:001:00 s 23 01 0001 0001 0000 0
> > ed2209c0 1237715985 C Co:001:00 0 0
> 
> Another reset followed by another port disable.
> 
> > ed2209c0 1237716244 S Co:001:00 s 23 03 0004 0001 0000 0
> > ed2209c0 1237716308 C Co:001:00 0 0
> > ed2209c0 1237774094 S Ci:001:00 s a3 00 0000 0001 0004 4 <
> > ed2209c0 1237775327 C Ci:001:00 0 4 = 03051000
> > ed2209c0 1237834107 S Co:001:00 s 23 01 0014 0001 0000 0
> > ed2209c0 1237834183 C Co:001:00 0 0
> > ed2209c0 1237854094 S Ci:003:00 s 80 06 0100 0000 0008 8 <
> > ed2209c0 1237854455 C Ci:003:00 0 8 = 12011002 00000040
> > ed2209c0 1237854963 S Ci:003:00 s 80 06 0100 0000 0012 18 <
> > ed2209c0 1237855219 C Ci:003:00 0 18 = 12011002 00000040 2505a5a4 17030304 0001
> > ed2209c0 1237855544 S Ci:003:00 s 80 06 0f00 0000 0005 5 <
> > ed2209c0 1237855771 C Ci:003:00 0 5 = 050f1600 02
> > ed2209c0 1237856062 S Ci:003:00 s 80 06 0f00 0000 0016 22 <
> > ed2209c0 1237856265 C Ci:003:00 0 22 = 050f1600 02071002 02000000 0a100300 0f000101 f401
> > ed2209c0 1237856548 S Ci:003:00 s 80 06 0200 0000 0020 32 <
> > ed2209c0 1237858430 C Ci:003:00 0 32 = 09022000 010100c0 01090400 00020806 50010705 81020002 00070501 02000201
> > ed2200c0 1237860245 S Co:003:00 s 00 09 0001 0000 0000 0
> > ed2200c0 1237861785 C Co:003:00 0 0
> 
> Then a normal reset, like we should have seen originally.  I have no
> idea what was going on.  Maybe something involving warm vs. cold 
> resets?
> 
> > ed2541c0 1237875505 S Bo:003:01 -115 31 = 55534243 07000000 c0000000 8000061a 003f00c0 00000000 00000000 000000
> > ed2541c0 1237875778 C Bo:003:01 0 31 >
> > ed2200c0 1237876448 S Bi:003:01 -115 192 <
> > ed2200c0 1237876534 C Bi:003:01 -32 0
> > ed2541c0 1237876703 S Co:003:00 s 02 01 0000 0081 0000 0
> > ed2541c0 1237876883 C Co:003:00 0 0
> > ed2541c0 1237876987 S Bi:003:01 -115 13 <
> > ed2541c0 1237877114 C Bi:003:01 0 0
> > ed2541c0 1237877486 S Bi:003:01 -115 13 <
> > ed2541c0 1237877572 C Bi:003:01 0 13 = 55534253 07000000 c0000000 01
> 
> Here the MODE SENSE command was sent again, and this time the gadget
> responded in a way that the host could accept.  I think it still
> wasn't _right_, because it appears the gadget tried to send a 0-length
> reply after the reset.  But at least it didn't provoke another reset.
> 
> > ed2541c0 1237877915 S Bo:003:01 -115 31 = 55534243 08000000 12000000 80000603 00000012 00000000 00000000 000000
> > ed2541c0 1237878041 C Bo:003:01 0 31 >
> > ed2200c0 1237878203 S Bi:003:01 -115 18 <
> > ed2200c0 1237878318 C Bi:003:01 0 18 = 70000600 0000000a 00000000 29000000 0000
> > ed2541c0 1237878415 S Bi:003:01 -115 13 <
> 
> Here's the Unit Attention status.
> 
> > ed2541c0 1237878467 C Bi:003:01 0 13 = 55534253 08000000 00000000 00
> > ed2541c0 1237878796 S Bo:003:01 -115 31 = 55534243 09000000 c0000000 8000061a 003f00c0 00000000 00000000 000000
> > ed2541c0 1237878961 C Bo:003:01 0 31 >
> > ed2200c0 1237879151 S Bi:003:01 -115 192 <
> > ed2200c0 1237879200 C Bi:003:01 -32 0
> > ed2541c0 1237880107 S Co:003:00 s 02 01 0000 0081 0000 0
> > ed2541c0 1237880279 C Co:003:00 0 0
> > ed2541c0 1237880387 S Bi:003:01 -115 13 <
> > ed2541c0 1237880524 C Bi:003:01 -75 0
> 
> And now the MODE SENSE is repeated, and again there's an overflow error.
> There's a lot more, but I'm not going to look at it now.  There's 
> plenty of stuff to fix just in the portion above.

Right, I'll go fix those up. Thanks.

-- 
balbi
