Great! I'll send Greg a pull request with this patch, and a patch to revert Andiry's reset resume patch. Sarah Sharp On Thu, Dec 01, 2011 at 01:57:51PM +0100, Julian Sikorski wrote: > Hi, > > OK, this one port dead story seems like an isolated, unrelated incident. > I have been suspending and resuming the machine many times, plugging the > drive into both ports and the controller seems rock solid with this new > patch. The fact that I was able to suspend every time just re-confirms > that it keeps responding. As usual, /var/log/messages is attached. > > Julian > > W dniu 01.12.2011 02:14, Julian Sikorski pisze: > > I am having a mixed answer. Here is what I did: > > > > I plugged the drive in > > -disconnected it > > - suspended/resumed > > - reconnected > > - used it for 90 minutes > > Everything was fine, which seems better than an unpatched kernel case. I > > then continued: > > - suspended it with the drive connected (around 01:51:52) > > - resumed, the drive still worked > > Unfortunalely, the second port stopped responding (01:57:05). Another > > one or two suspend-resume cycles did not bring it back to life, but the > > first port was still working fine. > > I am not sure if this is not a different problem, since normally after a > > failure the system would not suspend at all. This time one port just > > seem to be acting out. Oddly enough, nothing was ever connected to it > > during this session. I will keep testing since something might > > definitely be going on (it is definitely more stable, but let's hold on > > with the final call). > > In the meantime, please have a look at /var/log/messages, maybe there is > > something interesting in it. > > > > Regards, > > Julian > > > > > > W dniu 30.11.2011 19:29, Sarah Sharp pisze: > >> Good catch! > >> > >> Is there any chance that Julian's instability after system resume is > >> related to this bug? If you forced a reset resume, the xHCI driver > >> would have reallocated the command ring with a proper link TRB. Without > >> the reset resume, the zeroed command ring wouldn't have a link TRB and > >> the host controller would have eventually walked off the end of the > >> command ring. That might explain why the host controller stopped > >> responding to the stop endpoint command without the reset resume, but > >> only after a very long time (half an hour). > >> > >> Julian, can you revert Andiry's patch to add the reset resume, add this > >> patch instead, and see if it fixes your instability issues? If so, I > >> think this is a better fix. > >> > >> Sarah Sharp > >> > >> On Wed, Nov 30, 2011 at 04:37:41PM +0800, Andiry Xu wrote: > >>> When system enters suspend, xHCI driver clears command ring by writing zero > >>> to all the TRBs. However, this also writes zero to the Link TRB, and the ring > >>> is mangled. This may cause driver accesses wrong memory address and the > >>> result is unpredicted. > >>> > >>> When clear the command ring, keep the last Link TRB intact, only clear its > >>> cycle bit. This should fix the "command ring full" issue reported by Oliver > >>> Neukum. > >>> > >>> This should be backported to stable kernels as old as 2.6.37, since the > >>> commit 89821320 "xhci: Fix command ring replay after resume" is merged. > >>> > >>> Signed-off-by: Andiry Xu <andiry.xu@xxxxxxx> > >>> --- > >>> drivers/usb/host/xhci.c | 5 ++++- > >>> 1 files changed, 4 insertions(+), 1 deletions(-) > >>> > >>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c > >>> index aa94c01..a1afb7c 100644 > >>> --- a/drivers/usb/host/xhci.c > >>> +++ b/drivers/usb/host/xhci.c > >>> @@ -711,7 +711,10 @@ static void xhci_clear_command_ring(struct xhci_hcd *xhci) > >>> ring = xhci->cmd_ring; > >>> seg = ring->deq_seg; > >>> do { > >>> - memset(seg->trbs, 0, SEGMENT_SIZE); > >>> + memset(seg->trbs, 0, > >>> + sizeof(union xhci_trb) * (TRBS_PER_SEGMENT - 1)); > >>> + seg->trbs[TRBS_PER_SEGMENT - 1].link.control &= > >>> + cpu_to_le32(~TRB_CYCLE); > >>> seg = seg->next; > >>> } while (seg != ring->deq_seg); > >>> > >>> -- > >>> 1.7.4.1 > >>> > >>> > > > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html