Hi,

On Tue, Aug 02, 2011 at 08:33:27AM +0800, loody wrote:
> 2011/8/2 Felipe Balbi <balbi@xxxxxx>:
> > Hi,
> >
> > (please break your lines at 80 characters ;-)
> >
> > On Mon, Aug 01, 2011 at 02:57:42PM -0700, Perry Wagle wrote:
> >> If I have a 2.6.31.6 kernel running on an ARM, what kind of
> >> performance can I get how? I'm getting about 14 MB/sec read and write
> >
> > it pretty much depends on the controller you're using and how optimized
> > the driver for that controller is. This is really a case-by-case
> > analysis. While the stack itself poses some overhead, I would say
> > (didn't measure ok ;-) Linux's USB stacks (host and device side) are
> > quite low overhead...
> >
> >> and the customer of my customer (I'm an independent contractor) is
> >> expecting 20-30. Is that possible in linux? (I get the same
> >
> > well, if the HW _can_ do better and the SW isn't optimized, then yes,
> > why not ?!?
> >
> >> performance on a x86_64 running ubuntu 10.10). What's the deal we can
> >> tell them?
> >
> > we can't really tell you how to talk to your customers, what we can tell
> > is that without further information on the setup (which controller,
> > which device, who's playing the role as Host and who's playing role as
> > Device, which chipset on Host side, which disk are you using, etc) it's
> > quite difficult to give any tips.
> >
> per my experience.
> the performance will depend on following factors:
> 1. the speed of your device, since different device will adopt
> different algorithm to handle read/write.
> 2. the controller, as Felipe mentioned. For example, different
> controller will adopt different memory bus for read/write, ddr2 or
> ddr3.
> 3. how busy is your arm system.

there's more. If he's running as a device, the host plays quite a big
role. From my tests, I could see that intel EHCI chips could only do 8
bulk-out per uframe, while NVidia chips could squeeze 11 bulk-out per
uframe.
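For a rough sense of what those per-uframe counts allow, here is a
back-of-the-envelope ceiling. It is only a sketch: it assumes USB 2.0
high-speed bulk transfers with 512-byte packets and 8000 uframes per
second (figures from the USB 2.0 spec, not from this thread), and the
function name is made up for illustration.

```python
# Assumptions (not stated in the thread): USB 2.0 high-speed, where a bulk
# endpoint's max packet size is 512 bytes and there are 8000 microframes
# (uframes) per second.
BULK_MAX_PACKET = 512    # bytes per high-speed bulk packet
UFRAMES_PER_SEC = 8000   # 125 us per microframe


def bulk_ceiling_mb_per_s(packets_per_uframe):
    """Upper bound on bulk payload throughput, in MB/s (1 MB = 10**6 bytes).

    Hypothetical helper: it ignores protocol overhead, so real transfers
    will always land somewhat below this number.
    """
    return packets_per_uframe * BULK_MAX_PACKET * UFRAMES_PER_SEC / 1e6


# Per-uframe packet counts as reported in the message above.
for host, packets in (("Intel", 8), ("NVidia", 11)):
    print(f"{host}: {packets} bulk-out/uframe -> at most "
          f"{bulk_ceiling_mb_per_s(packets):.1f} MB/s")
```

Under those assumptions the ceilings come out at roughly 32.8 MB/s and
45.1 MB/s, so the 30MB/s and 37MB/s figures reported in this message sit
just below what each host could schedule.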
This means that using the same device, with the same kernel and the same
test application gave me different results when using different host
controllers (30MB/s with Intel's and 37MB/s with NVidia's).

So, like I said, this is really a case-by-case analysis. There's always
some optimization to be achieved on DMA handling, for example. Maybe
queueing many transfers to DMA, or using bigger buffers, or using DMA
chaining (where possible), etc.

-- 
balbi