Re: [PATCH] xHCI: fix bug in xhci_clear_command_ring()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Great!  I'll send Greg a pull request with this patch, and a patch to
revert Andiry's reset resume patch.

Sarah Sharp

On Thu, Dec 01, 2011 at 01:57:51PM +0100, Julian Sikorski wrote:
> Hi,
> 
> OK, this one port dead story seems like an isolated, unrelated incident.
> I have been suspending and resuming the machine many times, plugging the
> drive into both ports and the controller seems rock solid with this new
> patch. The fact that I was able to suspend every time just re-confirms
> that it keeps responding. As usual, /var/log/messages is attached.
> 
> Julian
> 
> W dniu 01.12.2011 02:14, Julian Sikorski pisze:
> > I am having a mixed answer. Here is what I did:
> > 
> > I plugged the drive in
> > -disconnected it
> > - suspended/resumed
> > - reconnected
> > - used it for 90 minutes
> > Everything was fine, which seems better than an unpatched kernel case. I
> > then continued:
> > - suspended it with the drive connected (around 01:51:52)
> > - resumed, the drive still worked
> > Unfortunalely, the second port stopped responding (01:57:05). Another
> > one or two suspend-resume cycles did not bring it back to life, but the
> > first port was still working fine.
> > I am not sure if this is not a different problem, since normally after a
> > failure the system would not suspend at all. This time one port just
> > seem to be acting out. Oddly enough, nothing was ever connected to it
> > during this session. I will keep testing since something might
> > definitely be going on (it is definitely more stable, but let's hold on
> > with the final call).
> > In the meantime, please have a look at /var/log/messages, maybe there is
> > something interesting in it.
> > 
> > Regards,
> > Julian
> > 
> > 
> > W dniu 30.11.2011 19:29, Sarah Sharp pisze:
> >> Good catch!
> >>
> >> Is there any chance that Julian's instability after system resume is
> >> related to this bug?  If you forced a reset resume, the xHCI driver
> >> would have reallocated the command ring with a proper link TRB.  Without
> >> the reset resume, the zeroed command ring wouldn't have a link TRB and
> >> the host controller would have eventually walked off the end of the
> >> command ring.  That might explain why the host controller stopped
> >> responding to the stop endpoint command without the reset resume, but
> >> only after a very long time (half an hour).
> >>
> >> Julian, can you revert Andiry's patch to add the reset resume, add this
> >> patch instead, and see if it fixes your instability issues?  If so, I
> >> think this is a better fix.
> >>
> >> Sarah Sharp
> >>
> >> On Wed, Nov 30, 2011 at 04:37:41PM +0800, Andiry Xu wrote:
> >>> When system enters suspend, xHCI driver clears command ring by writing zero
> >>> to all the TRBs. However, this also writes zero to the Link TRB, and the ring
> >>> is mangled. This may cause driver accesses wrong memory address and the
> >>> result is unpredicted.
> >>>
> >>> When clear the command ring, keep the last Link TRB intact, only clear its
> >>> cycle bit. This should fix the "command ring full" issue reported by Oliver
> >>> Neukum.
> >>>
> >>> This should be backported to stable kernels as old as 2.6.37, since the
> >>> commit 89821320 "xhci: Fix command ring replay after resume" is merged.
> >>>
> >>> Signed-off-by: Andiry Xu <andiry.xu@xxxxxxx>
> >>> ---
> >>>  drivers/usb/host/xhci.c |    5 ++++-
> >>>  1 files changed, 4 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> >>> index aa94c01..a1afb7c 100644
> >>> --- a/drivers/usb/host/xhci.c
> >>> +++ b/drivers/usb/host/xhci.c
> >>> @@ -711,7 +711,10 @@ static void xhci_clear_command_ring(struct xhci_hcd *xhci)
> >>>  	ring = xhci->cmd_ring;
> >>>  	seg = ring->deq_seg;
> >>>  	do {
> >>> -		memset(seg->trbs, 0, SEGMENT_SIZE);
> >>> +		memset(seg->trbs, 0,
> >>> +			sizeof(union xhci_trb) * (TRBS_PER_SEGMENT - 1));
> >>> +		seg->trbs[TRBS_PER_SEGMENT - 1].link.control &=
> >>> +			cpu_to_le32(~TRB_CYCLE);
> >>>  		seg = seg->next;
> >>>  	} while (seg != ring->deq_seg);
> >>>  
> >>> -- 
> >>> 1.7.4.1
> >>>
> >>>
> > 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux