--- On Fri, 6/18/10, Oliver Neukum <oliver@xxxxxxxxxx> wrote:

> On Thursday, 17 June 2010 05:41:24, David Brownell wrote:
> > Oliver, I still don't understand what you're
> > trying to say, or how it relates to the
> > structural point I was making: that the
> > batching isn't really needed (or helpful)
> > given sane USB DMA/transfer queues
> > (as on Linux).
>
> We were talking about an implementation of batching network
> packets for transfer used by NCM and possibly other drivers.

Not recently, except in the loose sense that if there's
going to be such batching code for NCM, it ought to be
reusable.

The messages to which you responded were on a different
topic: namely, that batching was just a workaround for
poor transfer queue support ... and given sane transfer
queueing support, implementations could just as easily
use that (while avoiding the need to memcpy every data
packet).

If you wanted to change the topic, it would really have
helped to change $SUBJECT.  For example "how to implement
NCM style batching".

> Thus the requirements of NCM at least have to be met.
>
> Looking at chapter 3.1 of the NCM specification, it seems
> to me that
>
> a) the host must transfer all data associated with an NTH
>    without short packets or ZLPs until the end
> b) after each short packet or ZLP a new NTH must be sent
>
> In addition if you look at chapter 3.3.4 of the NCM
> specification it is clear that the host must meet fairly
> arbitrary alignment requirements, which the device
> specifies at runtime.
>
> Now we are within the spec if we send out our data with an
> NTH, an NDH and a properly aligned datagram.

I'd have to get a new copy of the NCM spec and read it
again, in order to comment at that level of detail...
However, I think of the issue in a slightly different
way: namely, that the batching requires drivers to
construct USB transfers (1..N full-size packets, then a
short one to terminate) out of multiple SKBs (and for the
sake of argument, I'll assume the various NCM headers are
packaged in SKBs too ... possibly discrete, possibly
prepended to other SKBs).

We don't currently have a way to describe USB transfers
except as the single buffer associated with an URB.
Specifically, we can't build one transfer out of two
or more URBs.

> It seems to me that we can trivially meet the requirements
> of the NCM specification by allocating a large buffer and
> copying the datagrams (most likely ethernet frames) with
> the proper alignment into the buffer and transfer it by
> means of one URB.

Right NOW for TX, isn't that the only solution? ...
Unless you do like the RNDIS code and stick to one
network packet per batch, which lets you use much smaller
buffers (TX only), and thus avoid a lot of data copies
for TX.

And for RX ... isn't the solution the converse, but
sharing the same packet buffer between all the
single-packet SKBs extracted from that huge URB
transfer buffer?

> The problem here, as David pointed out, is that we must
> copy each datagram.

Copy on TX.  You'll observe that for example the RNDIS RX
code shares the underlying packet buffer (big) between the
various packets which get extracted ... to avoid copying,
but at a cost in terms of memory fragmentation...

There's a pragmatic issue with that: allocating big
SKBs fragments the relevant memory pools, and isn't even
guaranteed to work.  That's another reason to prefer
solutions that stick to queuing single network packets
in the USB transfer queues.

> Thus I thought about what we can do to avoid a copy.
> We can avoid a copy if and only if we can fit the NTH,
> NDH and padding for alignment into the buffer associated
> with the skb we are given.
That can be assured given MTU tricks; upper layers of the
network stack can be made to pre-allocate that memory, at
least for TX paths.  For RX it's less certain, unless the
other end can be made to adopt the same "only one packet
per bundle" policy.

> with multiple
> URBs the way we do it for storage.

Storage carefully keeps each URB's buffer distinct, and
doesn't try to map/combine them into a single ginormous
transfer, holding the equivalent of multiple Ethernet
packets.

> That, however, as we may not send short packets, is
> possible only if we give each URB a buffer that is
> a multiple of the device's maximum packet size. And I
> wondered whether we can meet this requirement without
> a copy.
>
> What do you think?

I think you need to consider the RX side too, and see what
parameter negotiations each of the protocols supports:
packets-per-bundle, etc.  The trivial case is "one packet
per bundle".  It's only larger bundles that get complicated
and slow.
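
For concreteness, here is a rough sketch of the copy-based TX layout
discussed in this thread, written as plain userspace C rather than
kernel code.  Every name in it (align_params, next_aligned_offset,
layout_batch, ends_with_short_packet) is made up for illustration.
Expressing the alignment rule as a divisor plus remainder is an
assumption about what the device reports at runtime and should be
checked against the NCM spec.  The sketch only computes where each
datagram would be copied in one large buffer, and whether the
resulting transfer length terminates with a short packet or would
need a ZLP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical alignment parameters, as a device might report them. */
struct align_params {
	size_t divisor;   /* offsets are taken modulo this; must be nonzero */
	size_t remainder; /* required value of (offset % divisor) */
};

/* Smallest offset >= cur such that offset % divisor == remainder. */
size_t next_aligned_offset(size_t cur, const struct align_params *p)
{
	size_t rem;

	assert(p->divisor != 0 && p->remainder < p->divisor);
	rem = cur % p->divisor;
	if (rem <= p->remainder)
		return cur + (p->remainder - rem);
	return cur + (p->divisor - rem) + p->remainder;
}

/*
 * Lay out up to n datagrams in a buffer of cap bytes, after a header
 * region of hdr_len bytes (standing in for the NTH/NDH space).
 * Stores each datagram's offset in offs[] and the total bytes used
 * in *used.  Returns how many datagrams fit.
 */
size_t layout_batch(const size_t *lens, size_t n, size_t hdr_len,
		    size_t cap, const struct align_params *p,
		    size_t *offs, size_t *used)
{
	size_t end = hdr_len;
	size_t i;

	for (i = 0; i < n; i++) {
		size_t off = next_aligned_offset(end, p);

		/* Stop at the first datagram that no longer fits. */
		if (off > cap || lens[i] > cap - off)
			break;
		offs[i] = off;
		end = off + lens[i];
	}
	*used = end;
	return i;
}

/*
 * True if a transfer of len bytes ends with a short packet by itself.
 * False means its length is an exact multiple of maxpacket (or zero),
 * so a zero-length packet would be needed to terminate it.
 */
bool ends_with_short_packet(size_t len, size_t maxpacket)
{
	assert(maxpacket != 0);
	return len % maxpacket != 0;
}
```

With "one packet per bundle", layout_batch() degenerates to a single
offset computation, which is the trivial case mentioned above; the
copying cost only shows up once several datagrams share one buffer.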