On Tue, Nov 20, 2012 at 03:40:44PM -0500, Alan Stern wrote:
> On Tue, 20 Nov 2012, Piergiorgio Sartor wrote:
>
> > This is the output of usbmod just when the problem happened:
> >
> > ffff8801232fd6c0 815566692 C Co:1:012:0 -2 0
> > ffff880130cd9000 815566767 S Co:1:012:0 s 23 03 0004 0001 0000 0
> > ffff880130cd9000 816577053 C Co:1:012:0 -2 0
> ...
> > ffff880026fb2780 845911126 S Co:1:013:0 s 23 03 0004 0002 0000 0
> > ffff880026fb2780 846922109 C Co:1:013:0 -2 0
>
> All this shows is that the computer isn't able to communicate with two
> of your hubs.
>
> > After dmesg reported the error -110 I stopped it.
> >
> > Interesting enough, with usbmod running (cat /sys/.../1u > usb1.txt)
> > the transfer rate, from the HDDs, was around about 10% slower than
> > usual and the problem did not show up at first.
>
> No doubt that was caused by the overhead of using usbmon.
>
> > I do not know if it was just by random, but this could hint a race
> > condition somewhere.
> >
> > Any idea?
>
> I need to see more of the context.  What shows up in the usbmon trace
> starting somewhat _before_ the problem happens?

OK, I got a log covering a working and then a non-working system,
so it should contain the transition.

The problem is that the file is 1.2MB; bzip2 reduces it to 200K.
How do I pass it to you?

Furthermore, this time it was quite hard to get the error.
The system was working for several minutes before it happened.

I noticed, maybe unrelated, that while the system was working the
CPU clock was at max (2.6GHz); just before it stopped working, the
CPU clock was at min (800MHz).

Another possibly related item: in this test run the transfer speed
was always at max, and monitoring the USB was not slowing it down as
seen before (same kernel).
I suspect that the clock running at max speed was responsible.

> Also, after the problem occurs, you should go into the
> /sys/kernel/debug/usb/ehci/ directory and find the subdirectory
> corresponding to the controller for bus 1.  Let's see what the files in
> that subdirectory say.

Of the 4 files I found there, 2 were empty; the others were
"periodic" and "registers", with the following content:

size = 512
1: qh256-0001/ffff88014a55a5a0 (h4 ep1in [1/0] q1 p1)
257: qh256-0001/ffff88014a55a5a0

bus pci, device 0000:00:0b.1
EHCI Host Controller
EHCI 1.00, rh state running
ownership 00000001
SMI sts/enable 0xc0080000
structural params 0x00101888
capability params 0x0000a086
status c008 Async Periodic FLR
command 0010015 (park)=0 ithresh=1 Periodic period=512 RUN
intrenable 37 IAA FATAL PCD ERR INT
uframe 0deb
port:1 status 003400 0 ACK POWER OWNER sig=k
port:2 status 003400 0 ACK POWER OWNER sig=k
port:3 status 001005 0 ACK POWER sig=se0 PE CONNECT
port:4 status 001000 0 ACK POWER sig=se0
port:5 status 001000 0 ACK POWER sig=se0
port:6 status 001000 0 ACK POWER sig=se0
port:7 status 003400 0 ACK POWER OWNER sig=k
port:8 status 003400 0 ACK POWER OWNER sig=k
irq normal 523806 err 103 iaa 29127 (lost 0)
complete 1188957 unlink 51

These were both taken after the error and after disconnecting
the HUBs, so I'm not sure whether they are meaningful to you.

If not, please let me know the exact procedure you need.

Thanks,

bye,

-- 

piergiorgio
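
For anyone reading the usbmon lines quoted at the top of this message, here is a rough Python sketch of how they could be decoded. It handles control transfers only and assumes the usbmon text format (URB tag, timestamp in microseconds, event type, address word, then either a setup packet or a status). The function names parse_line and describe, and the two small lookup tables, are made up for this illustration; the errno numbers are the Linux values.

```python
# Linux errno values seen in this thread (hard-coded so the sketch is portable).
ERRNO_NAMES = {2: "ENOENT", 110: "ETIMEDOUT"}
# Hub class port feature selector from the USB 2.0 spec (only the one seen here).
HUB_PORT_FEATURES = {4: "PORT_RESET"}

SAMPLE = """\
ffff880130cd9000 815566767 S Co:1:012:0 s 23 03 0004 0001 0000 0
ffff880130cd9000 816577053 C Co:1:012:0 -2 0
"""


def parse_line(line):
    """Split one usbmon text line into a dict (control transfers only)."""
    f = line.split()
    xfer, bus, dev, ep = f[3].split(":")
    rec = {"tag": f[0], "ts_us": int(f[1]), "event": f[2],
           "xfer": xfer, "bus": int(bus), "dev": int(dev), "ep": int(ep)}
    if rec["event"] == "S" and f[4] == "s":
        # Setup packet: bmRequestType bRequest wValue wIndex wLength (hex).
        bm, req, val, idx, length = (int(x, 16) for x in f[5:10])
        rec["setup"] = {"bmRequestType": bm, "bRequest": req,
                        "wValue": val, "wIndex": idx, "wLength": length}
    elif rec["event"] == "C":
        # Completion: the field after the address word is the URB status.
        rec["status"] = int(f[4])
    return rec


def describe(rec):
    """Return a short human-readable summary of a parsed record."""
    if "setup" in rec:
        s = rec["setup"]
        # 0x23 = host-to-device, class request, recipient "other" (a hub port);
        # bRequest 0x03 = SET_FEATURE.
        if s["bmRequestType"] == 0x23 and s["bRequest"] == 0x03:
            feat = HUB_PORT_FEATURES.get(s["wValue"], hex(s["wValue"]))
            return (f"SET_FEATURE({feat}) on port {s['wIndex']} "
                    f"of hub at address {rec['dev']}")
        return "control request"
    if "status" in rec:
        st = rec["status"]
        return "status " + ("OK" if st == 0 else "-" + ERRNO_NAMES.get(-st, str(-st)))
    return "?"


pending = {}  # URB tag -> submission timestamp, to pair S with C events
for raw in SAMPLE.splitlines():
    rec = parse_line(raw)
    print(rec["ts_us"], rec["event"], describe(rec))
    if rec["event"] == "S":
        pending[rec["tag"]] = rec["ts_us"]
    elif rec["tag"] in pending:
        print("  completed after", rec["ts_us"] - pending.pop(rec["tag"]), "us")
```

Read this way, the quoted lines are port reset requests sent to the hubs at addresses 12 and 13, each completing a little over one second later with status -2 (-ENOENT, i.e. the URB was unlinked). That looks consistent with a request that timed out and was cancelled, which would match the -110 (-ETIMEDOUT) reported in dmesg.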