On Sat, 15 Dec 2012, Piergiorgio Sartor wrote:

> Hi Alan,
>
> thanks for the patch, I applied it and tested.
>
> I've bad news...
> Unless I made some mistake (which can always be), I did not
> see any of the logs of this patch in the output, despite
> having seen the problem occuring (this time I used "cpuburn"
> to overload the system).

That's disappointing.

> The only reasonable log was like the previous one:
>
> [ 248.833978] ehci_hcd 0000:00:0b.1: alan start cur time 4294916032 last scan 4294885464
> [ 248.833991] ehci_hcd 0000:00:0b.1: command 10035 cmd reg 10075 io watchdog 1 async count 10
> [ 248.833998] ehci_hcd 0000:00:0b.1: async list:
> [ 248.834006] ehci_hcd 0000:00:0b.1: qh ffff880149fcfe00 hw ffff8800af800000 dma af800000 next ffff880146d70380 hw_next af8009c2
> [ 248.834013] ehci_hcd 0000:00:0b.1: qh ffff880146d70380 hw ffff8800af8009c0 dma af8009c0 next ffff880146c95b00 hw_next af800a82
> [ 248.834020] ehci_hcd 0000:00:0b.1: qh ffff880146c95b00 hw ffff8800af800a80 dma af800a80 next ffff880149b51f00 hw_next af800cc2
> [ 248.834027] ehci_hcd 0000:00:0b.1: qh ffff880149b51f00 hw ffff8800af800cc0 dma af800cc0 next ffff880145d17e80 hw_next af800f02
> [ 248.834033] ehci_hcd 0000:00:0b.1: qh ffff880145d17e80 hw ffff8800af800f00 dma af800f00 next ffff88013d137180 hw_next af800c02
> [ 248.834040] ehci_hcd 0000:00:0b.1: qh ffff88013d137180 hw ffff8800af800c00 dma af800c00 next ffff880148bc0000 hw_next af800d82
> [ 248.834046] ehci_hcd 0000:00:0b.1: qh ffff880148bc0000 hw ffff8800af800d80 dma af800d80 next ffff8801410f9f00 hw_next af800b42
> [ 248.834053] ehci_hcd 0000:00:0b.1: qh ffff8801410f9f00 hw ffff8800af800b40 dma af800b40 next ffff880147b8bc80 hw_next af8007e2
> [ 248.834060] ehci_hcd 0000:00:0b.1: qh ffff880147b8bc80 hw ffff8800af8007e0 dma af8007e0 next ffff880140b49180 hw_next af800e42
> [ 248.834066] ehci_hcd 0000:00:0b.1: qh ffff880140b49180 hw ffff8800af800e40 dma af800e40 next ffff880141f8ac00 hw_next af8008a2
> [ 248.834073] ehci_hcd 0000:00:0b.1: qh ffff880141f8ac00 hw ffff8800af8008a0 dma af8008a0 next (null) hw_next af800002
> [ 248.834080] ehci_hcd 0000:00:0b.1: This qh link time 4294885466 enqueue time 4294885466 td token 1f8c80 ov token 0
> [ 248.834087] ehci_hcd 0000:00:0b.1: qh ffff880146d70380 naf800a82 info 42002210 40000000 qtd af81ccc0
> [ 248.834095] ehci_hcd 0000:00:0b.1: overlay td ffff8800af8009d0 nae2531e0 00000001 t00000000 p0=af87fa9f
> [ 248.834103] ehci_hcd 0000:00:0b.1: dma ae2531e0 td ffff8800ae2531e0 naf81ef60 00000001 t001f8c80 p0=af87fa80
> [ 248.834111] ehci_hcd 0000:00:0b.1: dummy af81ef60 td ffff8800af81ef60 n00000001 00000001 t00000040 p0=00000000
> [ 248.845154] ehci_hcd 0000:00:0b.1: giveback urb ffff88014980dcc0 actual 0
> [ 248.845162] ehci_hcd 0000:00:0b.1: alan end

This says that everything is correct, but the async schedule isn't
running even though it is supposedly turned on.  Goodness knows why not.

> Note that there was not anymore the log line before "alan start".
>
> I suspect the previous error was something else, maybe something
> it happens seldom, since sometimes I saw some timeout errors, but
> they were almost always not a problem.

Maybe.

We need more debugging.  Two changes:

Firstly, in the code added by the earlier, larger patch, edit the
alan_debug() function in ehci-hcd.c.  Change the lines that say:

	ehci_dbg(ehci, "command %x cmd reg %x io watchdog %d async count %d\n",
			ehci->command, ehci_readl(ehci, &ehci->regs->command),
			ehci->need_io_watchdog, ehci->async_count);

to this instead:

	ehci_dbg(ehci, "command %x cmd reg %x status reg %x\n",
			ehci->command, ehci_readl(ehci, &ehci->regs->command),
			ehci_readl(ehci, &ehci->regs->status));

Secondly, apply the patch below on top of the last patch.  It will tell
us when the async schedule gets turned on and off.

Alan Stern



Index: usb-3.7/drivers/usb/host/ehci-timer.c
===================================================================
--- usb-3.7.orig/drivers/usb/host/ehci-timer.c
+++ usb-3.7/drivers/usb/host/ehci-timer.c
@@ -126,9 +126,10 @@ static void ehci_poll_ASS(struct ehci_hc
 
 	/* The status is up-to-date; restart or stop the schedule as needed */
 	if (want == 0) {	/* Stopped */
-		if (ehci->async_count > 0)
+		if (ehci->async_count > 0) {
 			ehci_set_command_bit(ehci, CMD_ASE);
-
+			ehci_dbg(ehci, "async on\n");
+		}
 	} else {		/* Running */
 		if (ehci->async_count == 0) {
 
@@ -143,6 +144,7 @@ static void ehci_poll_ASS(struct ehci_hc
 static void ehci_disable_ASE(struct ehci_hcd *ehci)
 {
 	ehci_clear_command_bit(ehci, CMD_ASE);
+	ehci_dbg(ehci, "async off\n");
 }
 
 