Correction, this regression/commit (55bcdce) was introduced between rc6 and rc7. Sorry about that.

2013/2/18 Ronald <ronald645@xxxxxxxxx>:
> CC'ing the author of the patch.
>
> 2013/2/18 Ronald <ronald645@xxxxxxxxx>:
>>>>> This e-mail is a follow-up as requested in this bug[1]. I will repost
>>>>> everything so far in this e-mail. Please CC me as I'm not subscribed
>>>>> to your list.
>>>>>
>>>>> Current head gives this when I plug a 'Mass Storage Device' into a 2.0 hub:
>>>>>
>>>>> [ 842.760400] hub 1-0:1.0: unable to enumerate USB device on port 3
>>>>> [ 843.080058] usb 1-3: new high-speed USB device number 48 using ehci-pci
>>>>> [ 858.230072] usb 1-3: device descriptor read/64, error -110
>>>>> [ 873.490070] usb 1-3: device descriptor read/64, error -110
>>>>>
>>>>> Reverting the following commit makes it work again:
>>>>>
>>>>> commit 55bcdce8a8228223ec4d17d8ded8134ed265d2c5
>>>>> Author: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
>>>>> Date: Fri Jan 25 16:52:45 2013 -0500
>>>>>
>>>>>     USB: EHCI: remove ASS/PSS polling timeout
>>>>>
>>>>>     This patch (as1647) attempts to work around a problem that seems to
>>>>>     affect some nVidia EHCI controllers. They sometimes take a very long
>>>>>     time to turn off their async or periodic schedules. I don't know if
>>>>>     this is a result of other problems, but in any case it seems wise not
>>>>>     to depend on schedule enables or disables taking effect in any
>>>>>     specific length of time.
>>>>>
>>>>>     The patch removes the existing 20-ms timeout for enabling and
>>>>>     disabling the schedules. The driver will now continue to poll the
>>>>>     schedule state at 1-ms intervals until the controller finally decides
>>>>>     to obey the most recent command issued by the driver. Just in case
>>>>>     this hides a problem, a debugging message will be logged if the
>>>>>     controller takes longer than 20 polls.
>>>>>
>>>>>     I don't know if this will actually fix anything, but it can't hurt.
>>>>>
>>>>>     Signed-off-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
>>>>>     Tested-by: Piergiorgio Sartor <piergiorgio.sartor@xxxxxxxx>
>>>>>     CC: <stable@xxxxxxxxxxxxxxx>
>>>>>     Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>>>>
>>>>> diff --git a/drivers/usb/host/ehci-timer.c b/drivers/usb/host/ehci-timer.c
>>>>> index 20dbdcb..f904071 100644
>>>>> --- a/drivers/usb/host/ehci-timer.c
>>>>> +++ b/drivers/usb/host/ehci-timer.c
>>>>> @@ -113,14 +113,15 @@ static void ehci_poll_ASS(struct ehci_hcd *ehci)
>>>>>
>>>>>          if (want != actual) {
>>>>>
>>>>> -                /* Poll again later, but give up after about 20 ms */
>>>>> -                if (ehci->ASS_poll_count++ < 20) {
>>>>> -                        ehci_enable_event(ehci, EHCI_HRTIMER_POLL_ASS, true);
>>>>> -                        return;
>>>>> -                }
>>>>> -                ehci_dbg(ehci, "Waited too long for the async schedule
>>>>> status(%x/%x), giving up\n",
>>>>> -                                want, actual);
>>>>> +                /* Poll again later */
>>>>> +                ehci_enable_event(ehci, EHCI_HRTIMER_POLL_ASS, true);
>>>>> +                ++ehci->ASS_poll_count;
>>>>> +                return;
>>>>>          }
>>>>> +
>>>>> +        if (ehci->PSS_poll_count > 20)
>>>>> +                ehci_dbg(ehci, "PSS poll count reached %d\n",
>>>>> +                                ehci->PSS_poll_count);
>>>>>          ehci->PSS_poll_count = 0;
>>>>>
>>>>>          /* The status is up-to-date; restart or stop the schedule as needed */
>>>>>
>>>>> Please note that I'm using the 'irqpoll' cmdline option to improve system
>>>>> stability. What I forgot to mention in the bug was the chipset:
>>>>>
>>>>> 00:10.3 USB controller: VIA Technologies, Inc. USB 2.0 (rev 82)
>>>>> (prog-if 20 [EHCI])
>>>>>     Subsystem: Micro-Star International Co., Ltd. KT4AV motherboard
>>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr-
>>>>> Stepping- SERR- FastB2B- DisINTx-
>>>>>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
>>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>>     Latency: 32, Cache Line Size: 32 bytes
>>>>>     Interrupt: pin D routed to IRQ 21
>>>>>     Region 0: Memory at dffeff00 (32-bit, non-prefetchable) [size=256]
>>>>>     Capabilities: [80] Power Management version 2
>>>>>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>>>>         Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>>>>>     Kernel driver in use: ehci-pci
>>>>>
>>>>> Yes, it's 10 years old, and no, I'm not Scrooge. We are waiting a while
>>>>> for computer prices to plummet, mkay?
>>>>>
>>>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=54031
>>>>
>>>> I would like to add that while searching the mailing lists I stumbled on this:
>>>>
>>>> http://marc.info/?l=linux-usb&m=136045531311402&w=4
>>>>
>>>> It's an entirely separate USB issue in this cycle. The person is doing
>>>> git bisect to find the regression. I took another approach, as
>>>> recompiling the full kernel on a 1.25 GHz machine isn't even remotely
>>>> funny anymore.
>>>>
>>>> On top of HEAD I started reverting groups of USB EHCI patches one by
>>>> one. I'm just mentioning it since I'm not sure if this procedure is
>>>> accepted here.
>>>
>>> Did some more testing this morning. It seems like it's a race
>>> condition, which somewhat confirms that this patch is involved. I just
>>> had an occurrence where the kernel with this patch *not* reverted
>>> handled the USB device just fine.
>>> But subsequent attempts failed like this:
>>>
>>> # attempt 2
>>> [ 382.370377] hub 1-0:1.0: unable to enumerate USB device on port 3
>>> [ 382.690046] usb 1-3: new high-speed USB device number 7 using ehci-pci
>>> [ 397.840031] usb 1-3: device descriptor read/64, error -110
>>> # attempt 3
>>> [ 413.040329] hub 1-0:1.0: unable to enumerate USB device on port 3
>>> [ 413.360069] usb 1-3: new high-speed USB device number 8 using ehci-pci
>>> [ 428.510049] usb 1-3: device descriptor read/64, error -110
>>>
>>> Please note the ~15 second time-out between detection and error. This
>>> also explains what made an ordinary bisect somewhat 'tedious'... The
>>> kernel with this patch reverted handles the USB device solidly every
>>> time, on every subsequent occurrence so far.
>>>
>>> The 'bad' kernel handles the USB device correctly every once in a while.
>>> But subsequent occurrences have always failed so far under these
>>> conditions.
>>
>> One final observation: sometimes the kernel without the patch reverted
>> rejects the usb-stick right away (i.e. no initial one-time success).
>> Dmesg then looks like this:
>>
>> [ 146.980077] usb 1-3: new high-speed USB device number 4 using ehci-pci
>> [ 152.140713] usb 1-3: device descriptor read/all, error -110
>> [ 152.200338] hub 1-0:1.0: unable to enumerate USB device on port 3
>> [ 152.520050] usb 1-3: new high-speed USB device number 6 using ehci-pci
>> [ 167.670035] usb 1-3: device descriptor read/64, error -110
>> [ 182.930046] usb 1-3: device descriptor read/64, error -110
>> [ 183.100306] hub 1-0:1.0: unable to enumerate USB device on port 3
>> [ 183.420047] usb 1-3: new high-speed USB device number 8 using ehci-pci
>> [ 198.570063] usb 1-3: device descriptor read/64, error -110
>> [ 213.830047] usb 1-3: device descriptor read/64, error -110
>> [ 214.000248] hub 1-0:1.0: unable to enumerate USB device on port 3
>> # above 4 lines repeat until stick gets unplugged
>>
>> Notice that the first error relates to 'read/all' while subsequent
>> errors relate to 'read/64'. The kernel with the patch reverted still
>> works without a hitch so far.
>>
>> Conclusion: I'm pretty confident this patch is the 'first bad commit'.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
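
As a rough illustration of the behaviour change described in the quoted commit message (a 20-poll limit replaced by polling until the controller obeys, with a debug message past 20 polls), here is a small userspace sketch in C. It is not the driver code and it makes no claim about the cause of the failures reported above. The names `struct fake_hc`, `status_matches` and `poll_until_match` are hypothetical, and the controller is simply assumed to report the requested schedule state after a fixed number of polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical model of a host controller (assumption, not kernel code):
 * the schedule status only matches the requested state after
 * `delay_polls` unsuccessful polls.
 */
struct fake_hc {
    int delay_polls;  /* unsuccessful polls before status matches */
    int polls_seen;   /* polls performed so far */
};

/* One poll: returns true once the status has caught up with the command. */
static bool status_matches(struct fake_hc *hc)
{
    return hc->polls_seen++ >= hc->delay_polls;
}

/*
 * Poll until the status matches the command.
 *
 * max_polls > 0 models the old behaviour: give up once that many
 * re-polls have been scheduled without success (20 in the old code).
 * max_polls == 0 models the patched behaviour: keep polling, and only
 * log a message if more than 20 polls were needed.
 *
 * Returns the number of unsuccessful polls, or -1 if it gave up.
 */
static int poll_until_match(struct fake_hc *hc, int max_polls)
{
    int count = 0;

    assert(hc != NULL);

    while (!status_matches(hc)) {
        if (max_polls > 0 && count >= max_polls)
            return -1;          /* old behaviour: waited too long */
        ++count;                /* poll again later */
    }

    if (count > 20)
        printf("poll count reached %d\n", count);

    return count;
}
```

With a modelled controller that needs 35 polls, `poll_until_match(&hc, 20)` gives up and returns -1, while `poll_until_match(&hc, 0)` on a fresh instance keeps going, prints the "poll count reached" message and returns 35. A controller that needs only 5 polls behaves the same in both modes.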