On 9/1/20 1:51 PM, Alan Stern wrote:
> On Tue, Sep 01, 2020 at 11:00:16AM -0600, Khalid Aziz wrote:
>> On 9/1/20 10:36 AM, Alan Stern wrote:
>>> On Tue, Sep 01, 2020 at 09:15:46AM -0700, Khalid Aziz wrote:
>>>> On 8/31/20 8:31 PM, Alan Stern wrote:
>>>>> Can you collect a usbmon trace showing an example of this problem?
>>>>>
>>>>
>>>> I have attached usbmon traces for when USB hub with keyboards and mouse
>>>> is plugged into USB 2.0 port and when it is plugged into the NEC USB 3.0
>>>> port.
>>>
>>> The usbmon traces show lots of errors, but no Clear-TT events. The
>>> large number of errors suggests that you've got a hardware problem;
>>> either a bad hub or bad USB connections.
>>
>> That is what I thought initially which is why I got additional hubs and
>> a USB 2.0 PCI card to test. I am seeing errors across 3 USB controllers,
>> 4 USB hubs and 4 slow/full speed devices. All of the hubs and slow/full
>> devices work with zero errors on my laptop. My keyboard/mouse devices
>> and 2 of my USB hubs predate motherboard update and they all worked
>> flawlessly before the motherboard upgrade. Some combinations of these
>> also works with no errors on my desktop with new motherboard that I had
>> listed in my original email:
>
> It's a very puzzling situation.
>
> One thing which probably would work well, surprisingly, would be to buy
> an old USB-1.1 hub and plug it into the PCI card. That combination is
> likely to be similar to what you see when plugging the devices directly
> into the PCI card. It might even work okay with the USB-3 controllers.
>
>> 2. USB 2.0 controller - WORKS
>> 5. USB 3.0/3.1 controller -> Bus powered USB 2.0 hub - WORKS
>>
>> I am not seeing a common failure here that would point to any specific
>> hardware being bad. Besides, that one code change (which I still can't
>> say is the right code change) in ehci-q.c makes USB 2.0 controller work
>> reliably with all my devices.
>
> The USB and EHCI designs are flawed in that under the circumstances
> you're seeing, they don't have any way to tell the difference between a
> STALL and a host timing error. The current code treats these situations
> as timing/transmission errors (resulting in device resets); your change
> causes them to be treated as STALLs. However, there are known, common
> situations in which those same symptoms really are caused by
> transmission errors, so we don't want to start treating them as STALLs.
>
> Besides, I suspect that your code change does _not_ make the USB-2
> controller work reliably with your devices. You should collect a usbmon
> trace under those conditions; I predict it will be full of STALLs. And
> furthermore, I believe these STALLs will not show up in a usbmon trace
> made with the devices plugged directly into the PCI card. If I'm right
> about these things, the errors are still present even with your patch;
> all it does is hide them.
>
> Short of a USB bus analyzer, however, there's no way to tell what's
> really going on.

I have managed to find a hardware combination that seems to work, so for
now at least my machine is usable. I will figure out how to interpret
usbmon output and run more experiments. There seems to be a real problem
somewhere in the driver, and it should be solved.

Thanks,
Khalid
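
For interpreting usbmon output, one rough way to check the prediction
about STALLs is to tally the completion statuses in a text-format trace.
The sketch below is only an illustration. It assumes the usbmon text
format as described in the kernel's usbmon documentation (event type in
the third field, status word in the fifth) and the usual Linux errno
values, where -32 (EPIPE) is how a STALL is reported. The function name
and the sample lines are made up for the example and are not taken from
the attached traces.

```python
from collections import Counter

# Negative Linux errno values as they appear in the usbmon status field.
# Assumption: standard errno numbering (EPIPE=32, ETIME=62, EPROTO=71, EILSEQ=84).
STATUS_NAMES = {
    0: "OK",
    -32: "EPIPE (STALL)",
    -62: "ETIME",
    -71: "EPROTO",
    -84: "EILSEQ",
}


def count_completion_statuses(lines):
    """Count URB status values over callback ('C') events in a usbmon text trace."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Expected fields: tag, timestamp, event type, address word, status word, ...
        if len(fields) < 5 or fields[2] != "C":
            continue
        # Interrupt/isochronous status words look like "status:interval[:...]".
        status_text = fields[4].split(":")[0]
        try:
            status = int(status_text)
        except ValueError:
            continue  # not a numeric status word; skip the line
        counts[status] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines in usbmon text format.
    sample = [
        "d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 <",
        "d5ea89a0 3575914560 C Ci:1:001:0 0 4 = 01050000",
        "ffff8800 3575915000 C Ii:1:003:1 -32:8 0",
        "ffff8800 3575916000 C Ii:1:003:1 -71:8 0",
    ]
    for status, n in sorted(count_completion_statuses(sample).items()):
        print(STATUS_NAMES.get(status, str(status)), n)
```

Running this over a trace taken with the patch applied and over one taken
with the devices plugged directly into the PCI card would give a quick
comparison of how many -32 completions each contains.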