Re: [Bug 217242] CPU hard lockup related to xhci/dma

On 4/2/23 20:57, Alan Stern wrote:
[Bugzilla removed from the CC: list, since this isn't relevant to the bug
report]

On Sun, Apr 02, 2023 at 07:25:27PM +0200, Greg KH wrote:
On Sun, Apr 02, 2023 at 05:54:18PM +0200, Hans Petter Selasky wrote:
While that being said, I wish the Linux USB core would take the example of
the FreeBSD USB core, and pre-allocate all memory needed for USB transfers,
also called URB's, during device attach.

Many drivers do that today already, which specific ones do you think
need to have this added that are not doing so?

Hans is undoubtedly referring to the host controller drivers.

Hi Alan,

Yes, I'm on the USB host side this time.

usb_alloc_urb() allocates memory for the URB itself.  But the routine does
not know which device or host controller the URB will eventually be used
with, so it doesn't know which HCD to tell to set aside adequate memory
for handling the URB once it is submitted.  And since HCDs tend to process
URB submissions while holding a private spinlock, when their memory
allocation does get done it cannot use GFP_KERNEL.

I remember a time, long ago, when memory allocation in FreeBSD was very slow, and it was difficult to test the USB control endpoint without using 100% CPU. The reason was that user-space applications used IOCTLs to issue USB control endpoint requests synchronously, which led to the request data being repeatedly allocated and freed. That was before jemalloc and per-CPU slabs. It was not the amount of data that caused problems but the request rate, typically 1000 - 8000 requests per second. Finding free holes in memory bitmaps due to fragmentation is _very_ expensive!


I think it's fair to call this a weak point in Linux's USB stack.
Balancing this, it should be pointed out that we can't always know in
advance how large an URB's transfer buffer will be, and the amount of
memory that the HCD will need can depend on this size.

In FreeBSD you have to specify a maximum length in bytes for each "urb", or FreeBSD USB transfer, along with various other static properties. You then don't allocate and free those URBs, so to speak, but just keep re-using them after the first allocation. All XHCI DMA structures are thus pre-allocated: because we know PAGE_SIZE and how everything is laid out in memory, it is easy to compute exactly the worst and best case for the number of hardware structures you need.
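
To illustrate the kind of computation meant here, the following is a minimal sketch and not the FreeBSD code. It assumes, as a simplification, that one hardware descriptor is needed per memory page the buffer touches. The names sketch_min_segments(), sketch_max_segments() and SKETCH_PAGE_SIZE are hypothetical, and the 4096-byte page size is only an example value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page size; a real driver would take it from the platform. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Best case: the buffer starts exactly on a page boundary, so it
 * touches ceil(max_len / PAGE_SIZE) pages.
 */
static size_t sketch_min_segments(size_t max_len)
{
    if (max_len == 0)
        return 0;
    return (max_len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Worst case: the buffer starts on the last byte of a page. That page
 * holds one byte and the remaining max_len - 1 bytes need
 * ceil((max_len - 1) / PAGE_SIZE) further pages.
 */
static size_t sketch_max_segments(size_t max_len)
{
    if (max_len == 0)
        return 0;
    return 1 + (max_len + SKETCH_PAGE_SIZE - 2) / SKETCH_PAGE_SIZE;
}
```

With the maximum length fixed at attach time, the worst-case value is what you would reserve, so nothing has to be allocated later when the transfer is submitted.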

This is also very useful for boot loaders: FreeBSD USB can either run entirely single-threaded with a few fixed-size memory pools, or multi-threaded as part of a bigger OS.
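
As a rough sketch of what a fixed-size memory pool can look like in the single-threaded case (again not the FreeBSD implementation), the pool below reserves all of its storage up front and hands slots out from a free list. All names and sizes (sketch_pool, SKETCH_POOL_SLOTS, SKETCH_SLOT_BYTES) are hypothetical, and there is deliberately no locking, so a multi-threaded user would have to add it.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_POOL_SLOTS 8    /* hypothetical number of slots */
#define SKETCH_SLOT_BYTES 512  /* hypothetical payload size per slot */

struct sketch_slot {
    struct sketch_slot *next_free;          /* link used while the slot is free */
    unsigned char data[SKETCH_SLOT_BYTES];  /* payload, reused across requests */
};

struct sketch_pool {
    struct sketch_slot slots[SKETCH_POOL_SLOTS]; /* all storage reserved up front */
    struct sketch_slot *free_list;
};

/* Put every slot on the free list once, at attach time. */
static void sketch_pool_init(struct sketch_pool *p)
{
    p->free_list = NULL;
    for (size_t i = 0; i < SKETCH_POOL_SLOTS; i++) {
        p->slots[i].next_free = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/* Take a slot in constant time; returns NULL when the pool is exhausted. */
static struct sketch_slot *sketch_pool_get(struct sketch_pool *p)
{
    struct sketch_slot *s = p->free_list;

    if (s != NULL)
        p->free_list = s->next_free;
    return s;
}

/* Return a slot in constant time so that it can be reused. */
static void sketch_pool_put(struct sketch_pool *p, struct sketch_slot *s)
{
    s->next_free = p->free_list;
    p->free_list = s;
}
```

Taking and returning a slot never searches for free memory, so the cost per request stays constant regardless of the request rate.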

Frequently going through allocate
and free cycles during operation, is not just inefficient, but also greatly

In fact, the original Slab memory allocator (in Solaris 2.4) was designed
to make frequent allocate-and-free cycles extremely efficient.  So much so
that people would just naturally do things that way instead of
pre-allocating memory which would then just sit around unused a large
fraction of the time.

I suspect the allocators in the Linux kernel don't end up being quite as
efficient as the original Slab, however.


FreeBSD USB is a completely different design compared to Linux. Anyway, back to the topic and thanks for the chat :-)

--HPS


