Re: USB issue with kernel 3.6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Alan,

I applied the modifications you suggested, this is the output:

[  110.922009] ehci_hcd 0000:00:0b.1: async off
[  209.362134] ehci_hcd 0000:00:0b.1: async on
[  241.794760] ehci_hcd 0000:00:0b.1: alan start  cur time 4294908992 last scan 4294878606
[  241.794774] ehci_hcd 0000:00:0b.1: command 10035 cmd reg 10035 status reg e008
[  241.794779] ehci_hcd 0000:00:0b.1: async list:
[  241.794787] ehci_hcd 0000:00:0b.1:  qh ffff880149fe7800 hw ffff8800af800000 dma af800000 next ffff880146ce3f80 hw_next af800c62
[  241.794795] ehci_hcd 0000:00:0b.1:  qh ffff880146ce3f80 hw ffff8800af800c60 dma af800c60 next ffff880146ce3780 hw_next af800d22
[  241.794801] ehci_hcd 0000:00:0b.1:  qh ffff880146ce3780 hw ffff8800af800d20 dma af800d20 next ffff880147776680 hw_next af800ae2
[  241.794808] ehci_hcd 0000:00:0b.1:  qh ffff880147776680 hw ffff8800af800ae0 dma af800ae0 next ffff8801478bf380 hw_next af800a22
[  241.794815] ehci_hcd 0000:00:0b.1:  qh ffff8801478bf380 hw ffff8800af800a20 dma af800a20 next ffff88014070da00 hw_next af800f02
[  241.794821] ehci_hcd 0000:00:0b.1:  qh ffff88014070da00 hw ffff8800af800f00 dma af800f00 next ffff8801407d3800 hw_next af800de2
[  241.794828] ehci_hcd 0000:00:0b.1:  qh ffff8801407d3800 hw ffff8800af800de0 dma af800de0 next ffff8801478bfc80 hw_next af800782
[  241.794834] ehci_hcd 0000:00:0b.1:  qh ffff8801478bfc80 hw ffff8800af800780 dma af800780 next ffff88014079e480 hw_next af800ba2
[  241.794841] ehci_hcd 0000:00:0b.1:  qh ffff88014079e480 hw ffff8800af800ba0 dma af800ba0 next ffff8801478bf780 hw_next af8008a2
[  241.794847] ehci_hcd 0000:00:0b.1:  qh ffff8801478bf780 hw ffff8800af8008a0 dma af8008a0 next ffff8801478bfd00 hw_next af800962
[  241.794854] ehci_hcd 0000:00:0b.1:  qh ffff8801478bfd00 hw ffff8800af800960 dma af800960 next           (null) hw_next af800002
[  241.794861] ehci_hcd 0000:00:0b.1: This qh link time 4294878614 enqueue time 4294878614 td token 1f8c80 ov token 0
[  241.794869] ehci_hcd 0000:00:0b.1:   qh ffff8801478bfd00 naf800002 info 4200220e 40000000 qtd af820540
[  241.794876] ehci_hcd 0000:00:0b.1: overlay td ffff8800af800970 naf820d80 00000001 t00000000 p0=af883a9f
[  241.794884] ehci_hcd 0000:00:0b.1:   dma af820d80 td ffff8800af820d80 naf820960 00000001 t001f8c80 p0=af883a80
[  241.794892] ehci_hcd 0000:00:0b.1:   dummy af820960 td ffff8800af820960 n00000001 00000001 t00000040 p0=00000000
[  241.805935] ehci_hcd 0000:00:0b.1: giveback urb ffff8801455fa300 actual 0
[  241.805943] ehci_hcd 0000:00:0b.1: alan end

The first "async off" happened, I guess correctly,
when the hot-plug settled and all the HDDs where
in place as expected.
The "async on" happened as soon as I started to
read 10 HDDs in parallel, I think correctly too.

After few seconds, this time without any "help",
the problem appeared with the log as reported.

I noticed several "async on/off" in dmesg, it
seems to me all were consistent.

bye,

pg

On Sat, Dec 15, 2012 at 03:53:09PM -0500, Alan Stern wrote:
> On Sat, 15 Dec 2012, Piergiorgio Sartor wrote:
> 
> > Hi Alan,
> > 
> > thanks for the patch, I applied it and tested.
> > 
> > I've bad news...
> > Unless I made some mistake (which can always be), I did not
> > see any of the logs of this patch in the output, despite
> > having seen the problem occuring (this time I used "cpuburn"
> > to overload the system).
> 
> That's disappointing.
> 
> > The only reasonable log was like the previous one:
> > 
> > [  248.833978] ehci_hcd 0000:00:0b.1: alan start  cur time 4294916032 last scan 4294885464
> > [  248.833991] ehci_hcd 0000:00:0b.1: command 10035 cmd reg 10075 io watchdog 1 async count 10
> > [  248.833998] ehci_hcd 0000:00:0b.1: async list:
> > [  248.834006] ehci_hcd 0000:00:0b.1:  qh ffff880149fcfe00 hw ffff8800af800000 dma af800000 next ffff880146d70380 hw_next af8009c2
> > [  248.834013] ehci_hcd 0000:00:0b.1:  qh ffff880146d70380 hw ffff8800af8009c0 dma af8009c0 next ffff880146c95b00 hw_next af800a82
> > [  248.834020] ehci_hcd 0000:00:0b.1:  qh ffff880146c95b00 hw ffff8800af800a80 dma af800a80 next ffff880149b51f00 hw_next af800cc2
> > [  248.834027] ehci_hcd 0000:00:0b.1:  qh ffff880149b51f00 hw ffff8800af800cc0 dma af800cc0 next ffff880145d17e80 hw_next af800f02
> > [  248.834033] ehci_hcd 0000:00:0b.1:  qh ffff880145d17e80 hw ffff8800af800f00 dma af800f00 next ffff88013d137180 hw_next af800c02
> > [  248.834040] ehci_hcd 0000:00:0b.1:  qh ffff88013d137180 hw ffff8800af800c00 dma af800c00 next ffff880148bc0000 hw_next af800d82
> > [  248.834046] ehci_hcd 0000:00:0b.1:  qh ffff880148bc0000 hw ffff8800af800d80 dma af800d80 next ffff8801410f9f00 hw_next af800b42
> > [  248.834053] ehci_hcd 0000:00:0b.1:  qh ffff8801410f9f00 hw ffff8800af800b40 dma af800b40 next ffff880147b8bc80 hw_next af8007e2
> > [  248.834060] ehci_hcd 0000:00:0b.1:  qh ffff880147b8bc80 hw ffff8800af8007e0 dma af8007e0 next ffff880140b49180 hw_next af800e42
> > [  248.834066] ehci_hcd 0000:00:0b.1:  qh ffff880140b49180 hw ffff8800af800e40 dma af800e40 next ffff880141f8ac00 hw_next af8008a2
> > [  248.834073] ehci_hcd 0000:00:0b.1:  qh ffff880141f8ac00 hw ffff8800af8008a0 dma af8008a0 next           (null) hw_next af800002
> > [  248.834080] ehci_hcd 0000:00:0b.1: This qh link time 4294885466 enqueue time 4294885466 td token 1f8c80 ov token 0
> > [  248.834087] ehci_hcd 0000:00:0b.1:   qh ffff880146d70380 naf800a82 info 42002210 40000000 qtd af81ccc0
> > [  248.834095] ehci_hcd 0000:00:0b.1: overlay td ffff8800af8009d0 nae2531e0 00000001 t00000000 p0=af87fa9f
> > [  248.834103] ehci_hcd 0000:00:0b.1:   dma ae2531e0 td ffff8800ae2531e0 naf81ef60 00000001 t001f8c80 p0=af87fa80
> > [  248.834111] ehci_hcd 0000:00:0b.1:   dummy af81ef60 td ffff8800af81ef60 n00000001 00000001 t00000040 p0=00000000
> > [  248.845154] ehci_hcd 0000:00:0b.1: giveback urb ffff88014980dcc0 actual 0
> > [  248.845162] ehci_hcd 0000:00:0b.1: alan end
> 
> This says that everything is correct, but the async schedule isn't 
> running even though it is supposedly turned on.  Goodness knows why 
> not.
> 
> > Note that there was not anymore the log line before "alan start".
> > 
> > I suspect the previous error was something else, maybe something
> > it happens seldom, since sometimes I saw some timeout errors, but
> > they were almost always not a problem.
> 
> Maybe.  We need more debugging.  Two changes:
> 
> Firstly, in the code added by the earlier, larger patch, edit the 
> alan_debug() function in ehci-hcd.c.  Change the lines that say:
> 
> 	ehci_dbg(ehci, "command %x cmd reg %x io watchdog %d async count %d\n",
> 		ehci->command, ehci_readl(ehci, &ehci->regs->command),
> 		ehci->need_io_watchdog, ehci->async_count);
> 
> to this instead:
> 
> 	ehci_dbg(ehci, "command %x cmd reg %x status reg %x\n",
> 		ehci->command, ehci_readl(ehci, &ehci->regs->command),
> 		ehci_readl(ehci, &ehci->regs->status));
> 
> Secondly, apply the patch below on top of the last patch.  It will 
> tell us when the async schedule gets turned on and off.
> 
> Alan Stern
> 
> 
> 
> Index: usb-3.7/drivers/usb/host/ehci-timer.c
> ===================================================================
> --- usb-3.7.orig/drivers/usb/host/ehci-timer.c
> +++ usb-3.7/drivers/usb/host/ehci-timer.c
> @@ -126,9 +126,10 @@ static void ehci_poll_ASS(struct ehci_hc
>  
>  	/* The status is up-to-date; restart or stop the schedule as needed */
>  	if (want == 0) {	/* Stopped */
> -		if (ehci->async_count > 0)
> +		if (ehci->async_count > 0) {
>  			ehci_set_command_bit(ehci, CMD_ASE);
> -
> +			ehci_dbg(ehci, "async on\n");
> +		}
>  	} else {		/* Running */
>  		if (ehci->async_count == 0) {
>  
> @@ -143,6 +144,7 @@ static void ehci_poll_ASS(struct ehci_hc
>  static void ehci_disable_ASE(struct ehci_hcd *ehci)
>  {
>  	ehci_clear_command_bit(ehci, CMD_ASE);
> +	ehci_dbg(ehci, "async off\n");
>  }
>  
>  

-- 

piergiorgio
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux