--- On Sun, 6/20/10, Oliver Neukum <oliver@xxxxxxxxxx> wrote:

> On Thursday, 17 June 2010 05:41:24, David Brownell wrote:
>
> > Not recently, except in the loose sense that
> > if there's going to be such batching code
> > for NCM, it ought to be reusable.
>
> OK, so we need to discuss
>
> 1) do we want batching

To support NCM, EEM, and RNDIS we do. But recall my point was that
this is just a way to work around USB stacks that have weak support
for transfer queues ... So we're stuck with "wanting" to work around
USB stacks that don't work very well. What I'd call "stuck with",
not "want".

> 2) if so, how do we do it
> 3) how do we make it reusable
>
> Right?

Exactly.

> > Right NOW for TX, isn't that the only solution?
>
> Pretty much. But we may be able to innovate.
>
> > ... Unless you do like the RNDIS code and stick
> > to one network packet per batch, which lets you
> > use much smaller buffers (TX only), and
> > thus avoid a lot of data copies for TX.
> >
> > And for RX ... isn't the solution the converse,
> > but sharing the same packet buffer between all
> > the single-packet SKBs extracted from that
> > huge URB transfer buffer?
>
> I think so. But it seems to me that for RX the situation is worse.

Given batching, it's much worse because the buffer sizes are huge
("jumbograms") and all those weird alignment restrictions exist.
(Notice how normal network packets don't have such restrictions.
That's a cue that strange stuff is going on ...)

> For TX we might use scatter/gather. For RX that is not possible,
> as we cannot predict where the datagrams will start or stop.
>
> > > The problem here, as David pointed out, is that we must copy
> > > each datagram.
> >
> > Copy on TX. You'll observe that for example
> > the RNDIS RX code shares the underlying packet
> > buffer (big) between the various packets which
> > get extracted; ... to avoid copying, but at a
> > cost in terms of memory fragmentation...
>
> Do you have an alternative?

Don't bundle packets ... just use queues effectively to avoid wasted
bandwidth between USB transfers, and to have the network packets go
directly into and out of their buffers. :)

> > There's a pragmatic issue with that: allocating
> > big SKBs fragments the relevant memory pools, and
> > isn't even guaranteed to work.
> >
> > That's another reason to prefer solutions that
> > stick to queuing single network packets in the
> > USB transfer queues.
>
> Why? I can't see the reason for TX. We could
> always fall back to smaller buffers, couldn't we?

Both TX and RX benefit from smaller buffers that are consistently
sized, and never require memcpy().

> > > Thus I thought about what we can do to avoid a copy. We can
> > > avoid a copy if and only if we can fit the NTH, NDH and padding
> > > for alignment into the buffer associated with the skb we are
> > > given.

And that extra data isn't actually needed; consider that CDC Ethernet
works just fine without it, and approaches (closely!) the peak USB
bandwidth without memcpy() costs.

> > That can be assured given MTU tricks; upper
> > layers of the network stack can be made to
> > pre-allocate that memory, at least for TX paths.
>
> How does that work?

alloc_skb() or whatever, just returns bigger buffers.

> > For RX it's less certain, unless the other end
> > can be made to adopt the same "only one packet
> > per bundle" policy.
>
> Probably it can. The question is how hard that would
> affect performance.

That's implementation-specific. On Linux the effect is hardly
observable. On MS-Windows I suspect it'd hurt ... they wouldn't go
through so much work to infect protocols with costly mechanisms if
their implementations didn't need them badly... right?

> > The trivial case is "one packet per bundle". It's
> > only larger bundles that get complicated and slow.
>
> Do they?
> I mean they obviously complicate stuff on the host side, but is
> that outweighed by gains on the device side?

When I looked at it ... no, not outweighed in most cases.
Peripherals don't have DMA chaining in most cases, but they do have
DMA, and sane implementation strategies (not a given!!) can use that
to good effect without a need to bundle.

- Dave
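
For readers following the thread: the condition Oliver states above
(a copy can be avoided only if the NTH, NDH and alignment padding fit
into the buffer that comes with the skb) can be sketched in plain
userspace C. This is an illustration under assumptions, not code from
any driver. struct pkt_buf, pad_for_alignment() and
can_prepend_in_place() are hypothetical names, the alignment rule is
modeled as "payload offset modulo divisor equals remainder", and the
header size is left as a parameter rather than taken from the NCM
specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of an skb that matter here. */
struct pkt_buf {
    size_t headroom; /* free bytes in front of the packet data */
    size_t len;      /* length of the network packet in bytes */
};

/*
 * Padding needed between the headers and the payload so that the
 * payload offset, counted from the start of the transfer, satisfies
 *     offset % divisor == remainder.
 * The (divisor, remainder) pair is an assumed model of the alignment
 * restrictions mentioned in the thread.
 */
static size_t pad_for_alignment(size_t hdr_len, size_t divisor,
                                size_t remainder)
{
    assert(divisor > 0 && remainder < divisor);
    return (remainder + divisor - hdr_len % divisor) % divisor;
}

/*
 * True if headers plus padding fit into the existing headroom, i.e.
 * the one-packet-per-transfer case can be built in place, without
 * copying the packet into a separate, larger buffer.
 */
static bool can_prepend_in_place(const struct pkt_buf *p, size_t hdr_len,
                                 size_t divisor, size_t remainder)
{
    return p->headroom >=
           hdr_len + pad_for_alignment(hdr_len, divisor, remainder);
}
```

With, say, a 28-byte header block, divisor 8 and remainder 0,
pad_for_alignment() yields 4, so a buffer needs at least 32 bytes of
headroom before the in-place path applies; anything smaller falls
back to a copy. This is also why the "MTU tricks" mentioned in the
thread matter: if the upper layers reserve enough headroom up front,
the check always succeeds for TX.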