On 4/2/23 20:57, Alan Stern wrote:
[Bugzilla removed from the CC: list, since this isn't relevant to the bug
report]
On Sun, Apr 02, 2023 at 07:25:27PM +0200, Greg KH wrote:
On Sun, Apr 02, 2023 at 05:54:18PM +0200, Hans Petter Selasky wrote:
That being said, I wish the Linux USB core would follow the example of
the FreeBSD USB core and pre-allocate all memory needed for USB transfers,
also called URBs, during device attach.
Many drivers already do that today. Which specific ones do you think
need to have this added that are not doing so?
Hans is undoubtedly referring to the host controller drivers.
Hi Alan,
Yes, I'm on the USB host side this time.
usb_alloc_urb() allocates memory for the URB itself. But the routine does
not know which device or host controller the URB will eventually be used
with, so it doesn't know which HCD to tell to set aside adequate memory
for handling the URB once it is submitted. And since HCDs tend to process
URB submissions while holding a private spinlock, when their memory
allocation does get done it cannot use GFP_KERNEL.
I remember that a long time ago, when memory allocation was very slow in
FreeBSD, it was difficult to test the USB control endpoint without using
100% CPU at the same time. The reason was that user-space applications
used ioctls to do USB control endpoint requests synchronously, which led
to the request data being allocated and freed regularly. That was before
jemalloc and per-CPU slabs. It was not the amount of data that caused
problems, but the request rate, typically 1000 to 8000 requests per
second. Finding free holes in memory bitmaps due to fragmentation is
_very_ expensive!
I think it's fair to call this a weak point in Linux's USB stack.
Balancing this, it should be pointed out that we can't always know in
advance how large an URB's transfer buffer will be, and the amount of
memory that the HCD will need can depend on this size.
In FreeBSD you have to specify a maximum length in bytes per "urb", or
FreeBSD USB transfer, along with various other static properties. You
then don't allocate and free those URBs, so to speak, but just keep
re-using them after the first allocation. All XHCI DMA structures are
then simply pre-allocated: because we know the PAGE_SIZE and how things
are laid out in memory, it's easy to compute exactly the worst and best
case for the number of hardware structures you need.
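
To make the worst/best case arithmetic concrete, here is a rough sketch in
C. It only counts how many pages a buffer of a given maximum length can
touch. The page size constant and the function names are made up for
illustration, and assuming one hardware structure per page touched is a
simplification, not how any particular XHCI driver counts them.

```c
#include <stddef.h>

/* Hypothetical page size; a real system would use its own PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Best case: the buffer starts exactly on a page boundary. */
size_t pages_best_case(size_t max_len)
{
    return (max_len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Worst case: the buffer starts on the last byte of a page. */
size_t pages_worst_case(size_t max_len)
{
    if (max_len == 0)
        return 0;
    return (max_len + SKETCH_PAGE_SIZE - 2) / SKETCH_PAGE_SIZE + 1;
}
```

Since the maximum length is fixed at setup time, both numbers are known
before the first transfer is ever started.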
This is also very useful for boot loaders: FreeBSD USB can either run
fully single-threaded with a few fixed-size memory pools, or
multi-threaded as part of a bigger OS.
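
A minimal single-threaded sketch of such a fixed-size pool, in C, might
look like the following. This is not FreeBSD code: the struct layout, the
names and the sizes are all hypothetical, and there is no locking. The
point is only that every transfer is set up once with a maximum length and
then re-used, with no allocator calls afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS   4    /* hypothetical number of pre-allocated transfers */
#define XFER_MAX_LEN 512  /* hypothetical maximum length in bytes */

struct xfer {
    uint8_t buf[XFER_MAX_LEN]; /* storage reserved once, up front */
    size_t  len;               /* length of the current request */
    int     in_use;
};

struct xfer_pool {
    struct xfer slots[POOL_SLOTS];
};

/* Mark every slot free; called once, e.g. at device attach. */
void pool_init(struct xfer_pool *p)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        p->slots[i].len = 0;
        p->slots[i].in_use = 0;
    }
}

/* Hand out a free slot, or NULL if len is too large or the pool is empty. */
struct xfer *pool_get(struct xfer_pool *p, size_t len)
{
    if (len > XFER_MAX_LEN)
        return NULL;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i].in_use = 1;
            p->slots[i].len = len;
            return &p->slots[i];
        }
    }
    return NULL;
}

/* Return a slot to the pool so it can be re-used. */
void pool_put(struct xfer *x)
{
    assert(x != NULL && x->in_use);
    x->in_use = 0;
}
```

A request larger than the declared maximum is refused up front instead of
triggering a new allocation, and when all slots are busy the caller has to
wait or fail.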
Frequently going through allocate
and free cycles during operation, is not just inefficient, but also greatly
In fact, the original Slab memory allocator (in Solaris 2.4) was designed
to make frequent allocate-and-free cycles extremely efficient. So much so
that people would just naturally do things that way instead of
pre-allocating memory which would then just sit around unused a large
fraction of the time.
I suspect the allocators in the Linux kernel don't end up being quite as
efficient as the original Slab, however.
FreeBSD USB is a completely different design compared to Linux. Anyway,
back to the topic and thanks for the chat :-)
--HPS