Re: 6.5.0 broke XHCI URB submissions for count >512

[adding the people involved in developing and applying the culprit to
the list of recipients]

Hi! Thx for the report.

On 02.03.24 01:27, Chris Yokum wrote:
> We have found a regression bug, where more than 512 URBs cannot be
> reliably submitted to XHCI. URBs beyond that return 0x00 instead of
> valid data in the buffer.
> 
> Our software works reliably on kernel versions through 6.4.x and fails
> on versions 6.5, 6.6, 6.7, and 6.8.0-rc6. This was discovered when
> Ubuntu recently updated their latest kernel package to version 6.5.
> 
> The issue is limited to the XHCI driver and appears to be isolated to
> this specific commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/usb?h=v6.5&id=f5af638f0609af889f15c700c60b93c06cc76675

FWIW, that's f5af638f0609af ("xhci: Fix transfer ring expansion size
calculation") [v6.5-rc1] from Mathias.

> Attached is a test program that demonstrates the problem. We used a few
> different USB-to-Serial adapters with no driver installed as a
> convenient way to reproduce. We check the TRB debug information before
> and after to verify the actual number of allocated TRBs.
> 
> With some adapters on unaffected kernels, the TRB map gets expanded
> correctly. This directly corresponds to correct functional behavior. On
> affected kernels, the TRB ring does not expand, and our functional tests
> also will fail.
> 
> We don't know exactly why this happens. Some adapters do work correctly,
> so there seems to also be some subtle problem that was being masked by
> the liberal expansion of the TRB ring in older kernels. We also saw on
> one system that the TRB expansion did work correctly with one particular
> adapter. However, on all systems at least two adapters did exhibit the
> problem and fail.
> 
> Would it be possible to resolve this regression for the 6.8 release and
> backport the fix to versions 6.5, 6.6, and 6.7?

6.5 is EOL at kernel.org, so a backport to it is up to downstream
distros. With a bit of luck it might be possible to fix this for 6.8,
but it might already be too late for that. We'll see.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.



