Re: Increasing TRBS_PER_SEGMENT causes DMAR/PTE faults

On Fri, Aug 07, 2015 at 05:39:55PM +0300, Mathias Nyman wrote:
> 
> On 07.08.2015 15:40, linux-usb@xxxxxxxxxxx wrote:
> > Hi there,
> 
> Hi

Hi again (a few months later, when I finally found some time)


> > Moving from 4.0.4 to 4.0.5, my USB 3.0 controller became practically useless
> > (and this same issue persists [at least] up to and including 4.1.3).
> > Here's an excerpt from my syslog demonstrating the problem:
> > 
> ...
> > 
> > Looking in ChangeLog-4.0.5 for xhci-related changes gave me 3 hits:
> > 
> > (1)  xhci: gracefully handle xhci_irq dead device
> >      commit 948fa13504f80b9765d2b753691ab94c83a10341 upstream.
> > 
> > (2)  xhci: Solve full event ring by increasing TRBS_PER_SEGMENT to 256
> >      commit 18cc2f4cbbaf825a4fedcf2d60fd388d291e0a38 upstream.
> > 
> > (3)  xhci: fix isoc endpoint dequeue from advancing too far on transaction error
> >      commit d104d0152a97fade389f47635b73a9ccc7295d0b upstream.
> > 
> > The first one seemed fairly irrelevant.  The third one mentions DMA, but
> > looking at the actual changes, I couldn't see an obvious connection to the
> > 'PTE read access' issue.  So, I tested reverting the second patch in my
> > current (4.1.3) kernel and my USB 3.0 controller started working again.
> > 
> > This is good enough for me, but I thought you might want to look more closely
> > at the root cause, and ensure that the PTE is set up correctly even with the
> > increased TRBS_PER_SEGMENT.
> > 
> 
> We just found that the TRBS_PER_SEGMENT increase reveals an old off-by-one error in an upper boundary check.
> Previously a ring segment didn't use a full memory page, and each ring segment was allocated a new page, so the +1 off-by-one never caused any harm.
> 
> Now that we use the full memory page, the off-by-one actually allowed us to go past the allocated page.
> 
> This is fixed in patch 
> commit 7895086afde2a05fa24a0e410d8e6b75ca7c8fdd
>     xhci: fix off by one error in TRB DMA address boundary check
> 
> The fix is now in Greg's usb-linus branch, and should end up in Linus' 4.2 kernel (rc6 earliest)
> 
> Does that fix help in your case?

No, it did not. At first I thought it did, but that was most likely
because my patch reverting the TRBS_PER_SEGMENT adjustment was
(accidentally) still applied when I did the testing.
I did more thorough tests with otherwise unpatched 4.1.3 and 4.1.4
yesterday, with the following result: fixing the off-by-one error alone
did not help for 4.1.3/4.1.4. I still had to revert the TRBS_PER_SEGMENT
patch as well to get a working system.

In fact, the same problem persists in my current kernel (4.3.5),
except that there it is enough to revert the TRBS_PER_SEGMENT
patch, since the fix for the off-by-one error has long been incorporated
into mainline.
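
To make the arithmetic behind the off-by-one explanation quoted above
concrete, here is a minimal user-space sketch. It assumes 16-byte TRBs,
a 4 KiB page, and a previous TRBS_PER_SEGMENT value of 64. The helper
names are hypothetical and the two checks only illustrate the kind of
">" versus ">=" upper-bound comparison described; this is not the
driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumptions for this sketch: 16-byte TRBs and a 4 KiB page. */
#define TRB_SIZE      16u
#define PAGE_SIZE_4K  4096u

/* Segment size in bytes for a given TRBS_PER_SEGMENT value. */
static size_t segment_bytes(size_t trbs_per_segment)
{
	return trbs_per_segment * TRB_SIZE;
}

/* 256 TRBs fill a 4 KiB page exactly; 64 TRBs used only a quarter of it. */
static_assert(256u * TRB_SIZE == PAGE_SIZE_4K, "256 TRBs == one 4 KiB page");
static_assert(64u * TRB_SIZE == 1024u, "64 TRBs == 1 KiB");

/*
 * Hypothetical helper: off-by-one upper-bound check.
 * It accepts offset == seg_bytes, which is one byte past the segment.
 */
static bool offset_ok_buggy(size_t offset, size_t seg_bytes)
{
	return !(offset > seg_bytes);
}

/*
 * Hypothetical helper: corrected check.
 * The last valid byte offset inside the segment is seg_bytes - 1.
 */
static bool offset_ok_fixed(size_t offset, size_t seg_bytes)
{
	return !(offset >= seg_bytes);
}
```

With a 1 KiB segment in its own page, an offset of 1024 slips through
the buggy check but still lands inside the allocated page. With a 4 KiB
segment, an offset of 4096 slips through as well and lands in the next
page, which matches the explanation of why the bug only became visible
after the increase.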

I'm still interested in tracking this down, although the issue may be
quite specific to the combination of older hardware in my desktop box
(GA-EQ45M-S2 motherboard with LogiLink PC0057 PCI-e 4-port USB3 card),
since I have never had any problems running the exact same kernel on a
variety of other (more modern, mostly laptop) hardware.

Regards,

