Re: mass storage behaviour

On 05 Oct 2015, at 20:08, Felipe Balbi <balbi@xxxxxx> wrote:

> On Mon, Oct 05, 2015 at 07:30:05PM +0200, Paul Jones wrote:
>> I’m investigating the (lack of) performance (around 150MB/s) of the USB3380
>> gadget in mass storage mode.  Whilst tracing on a Linux 4.1 host I noticed
>> that the Linux mass storage driver is requesting 240 blocks, 16 blocks, 240
>> blocks, 16 blocks, etc. when doing a dd directly on the device: dd if=/dev/sdb
>> of=/dev/null bs=64k count=8k where /dev/sdb is the emulated device.  The
>> emulated device is provided from a secondary machine with a USB3380 card
>> emulating the mass storage device from a local SSD (local dd of the disk file
>> reads at 542 MB/s).
>> 
>> lsusb on the host:
>> /:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/3p, 480M
>>    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/8p, 480M
>> /:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/3p, 480M
>>    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/6p, 480M
>> /:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
>>    |__ Port 3: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
>>    |__ Port 6: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
>> /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/15p, 480M
>> 
>> lspci on the host:
>> 00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
>> 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
>> 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
>> 00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
>> 00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
>> 00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
>> 00:16.3 Serial controller: Intel Corporation 8 Series/C220 Series Chipset Family KT Controller (rev 04)
>> 00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 05)
>> 00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
>> 00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
>> 00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
>> 00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d5)
>> 00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
>> 00:1f.0 ISA bridge: Intel Corporation Q87 Express LPC Controller (rev 05)
>> 00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
>> 00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
>> 
>> Tracing using usbmon on the host shows:
>> ffff88002ea45b40 558636438 S Bo:2:003:2 -115 31 = 55534243 63c20000 00000000 00000600 00000000 00000000 00000000 000000
>> ffff88002ea45b40 558636455 C Bo:2:003:2 0 31 >
>> ffff88002ea45b40 558636459 S Bi:2:003:1 -115 13 <
>> ffff88002ea45b40 558636537 C Bi:2:003:1 0 13 = 55534253 63c20000 00000000 00
>> ffff88002ea45b40 558636593 S Bo:2:003:2 -115 31 = 55534243 64c20000 00000000 0000061e 00000001 00000000 00000000 000000
>> ffff88002ea45b40 558636610 C Bo:2:003:2 0 31 >
>> ffff88002ea45b40 558636627 S Bi:2:003:1 -115 13 <
>> ffff88002ea45b40 558636669 C Bi:2:003:1 0 13 = 55534253 64c20000 00000000 00
>> ffff88002ea45b40 558636748 S Bo:2:003:2 -115 31 = 55534243 65c20000 00e00100 80000a28 00000000 000000f0 00000000 000000
>> ffff88002ea45b40 558636757 C Bo:2:003:2 0 31 >
>> ffff880407982f00 558636765 S Bi:2:003:1 -115 122880 <
>> ffff880407982f00 558637699 C Bi:2:003:1 0 122880 = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> ffff88002ea45b40 558637717 S Bi:2:003:1 -115 13 <
>> ffff88002ea45b40 558637728 C Bi:2:003:1 0 13 = 55534253 65c20000 00000000 00
>> ffff88002ea45b40 558637760 S Bo:2:003:2 -115 31 = 55534243 66c20000 00200000 80000a28 00000000 f0000010 00000000 000000
>> ffff88002ea45b40 558637778 C Bo:2:003:2 0 31 >
>> ffff88040ca76a80 558637797 S Bi:2:003:1 -115 8192 <
>> ffff88040ca76a80 558637892 C Bi:2:003:1 0 8192 = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> ffff88002ea45b40 558637908 S Bi:2:003:1 -115 13 <
>> ffff88002ea45b40 558637933 C Bi:2:003:1 0 13 = 55534253 66c20000 00000000 00
>> ffff88002ea45b40 558637959 S Bo:2:003:2 -115 31 = 55534243 67c20000 00e00100 80000a28 00000001 000000f0 00000000 000000
>> ffff88002ea45b40 558637973 C Bo:2:003:2 0 31 >
>> ffff880407982f00 558637976 S Bi:2:003:1 -115 122880 <
>> ffff880407982f00 558638898 C Bi:2:003:1 0 122880 = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> ffff88002ea45b40 558638901 S Bi:2:003:1 -115 13 <
>> ffff88002ea45b40 558638938 C Bi:2:003:1 0 13 = 55534253 67c20000 00000000 00
>> ffff88002ea45b40 558638976 S Bo:2:003:2 -115 31 = 55534243 68c20000 00200000 80000a28 00000001 f0000010 00000000 000000
>> ffff88002ea45b40 558638984 C Bo:2:003:2 0 31 >
>> ffff88040ca76a80 558638992 S Bi:2:003:1 -115 8192 <
>> ffff88040ca76a80 558639095 C Bi:2:003:1 0 8192 = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> ffff88002ea45b40 558639110 S Bi:2:003:1 -115 13 <
>> ffff88002ea45b40 558639136 C Bi:2:003:1 0 13 = 55534253 68c20000 00000000 00
>> 
>> Any ideas why the driver is requesting varying block sizes?
>> 
>> It also seems odd that the mass storage gadget takes a long time to respond to
>> the CBW of each read request (as it doesn’t actually need to return the data
>> at that point).
>> 
>> It seems that the gadget is waiting for the underlying request to retrieve
>> data blocks to actually finish, before acknowledging the request, as the
>> response time after a 240 block request is consistently much longer than the
>> response time after a 16 block request.
>> 
>> Wouldn’t it be more efficient to immediately acknowledge the request, kick off
>> the underlying request to retrieve the data blocks in the gadget and use flow
>> control in the following BULK IN request if the data hasn’t arrived yet or is
>> arriving slower than the USB line speed?
>> 
>> Or is there some parameter that I have overlooked that controls/influences
>> this behaviour?
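
For anyone reading the usbmon dump above: the 31-byte Bo transfers are Bulk-Only Transport CBWs, and the request sizes can be read straight out of them. Below is a minimal sketch of a decoder; decode_cbw is a hypothetical helper name, not part of any existing tool, and it only understands READ(10). The input strings are copied from the trace. The 512-byte block size is inferred from the trace itself (122880 / 240 = 512).

```python
import struct

def decode_cbw(hex_words: str) -> dict:
    """Decode a 31-byte Bulk-Only Transport CBW from a usbmon hex dump."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 31:
        raise ValueError(f"expected 31 bytes, got {len(raw)}")
    # Header fields are little-endian: signature, tag, data transfer length,
    # then one byte each for flags, LUN and command block length.
    sig, tag, data_len, flags, lun, cb_len = struct.unpack_from("<4sIIBBB", raw, 0)
    if sig != b"USBC":
        raise ValueError("not a CBW (bad signature)")
    cdb = raw[15:15 + cb_len]
    out = {
        "tag": tag,
        "data_len": data_len,
        "dir": "in" if flags & 0x80 else "out",
        "lun": lun,
        "opcode": cdb[0],
    }
    if cdb[0] == 0x28:
        # READ(10): big-endian LBA at CDB bytes 2-5, block count at bytes 7-8.
        lba, _group, blocks = struct.unpack_from(">IBH", cdb, 2)
        out.update(lba=lba, blocks=blocks)
    return out

# The first three READ(10) commands from the trace.
trace_cbws = [
    "55534243 65c20000 00e00100 80000a28 00000000 000000f0 00000000 000000",
    "55534243 66c20000 00200000 80000a28 00000000 f0000010 00000000 000000",
    "55534243 67c20000 00e00100 80000a28 00000001 000000f0 00000000 000000",
]
for words in trace_cbws:
    cbw = decode_cbw(words)
    print(cbw["lba"], cbw["blocks"], cbw["data_len"])
```

Decoded this way, the reads cover LBA 0 (240 blocks), LBA 240 (16 blocks) and LBA 256 (240 blocks), so each 240 + 16 pair adds up to 256 blocks, or 128 KiB. That would be consistent with a 128 KiB request being split at a 240-block per-command limit, but that reading is an assumption; the trace alone does not prove where the split happens.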
> 
> g_mass_storage, by default, uses 2 struct usb_request, try increasing that to 4
> (can be done from make menuconfig itself) and see if anything changes.
If you are talking about the “number of storage pipeline buffers” option, I already have it set to 4.
I saw similar results on previous kernels where I hadn’t set this value to 4.

> I have been messing around with g_mass_storage myself and noticed some
> interesting behavior too which I'm yet to fully understand, but let's first make
> sure you're running into the same thing.
I’ll be happy to work on anything that improves performance here, as unfortunately the USB3380 doesn’t support UAS (it has no streams support).
I’ve been reading up on the USB spec in an effort to understand the kernel code, but I still have a long way to go in figuring out how it works.
In the meantime I will be happy to run some tests if you have any experimental code you would like tested, or if you have any pointers on where improvements could be made.
I can also send you a USB3380 PCIe card if you prefer to run the tests yourself.
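
To make test runs comparable, per-URB timings can be pulled out of a usbmon text capture. The following is only a rough sketch: urb_latencies is a hypothetical helper name, it assumes the usbmon text format shown in the trace (URB tag, timestamp in microseconds, event type, address, status, length), and it only handles bulk transfers.

```python
def urb_latencies(lines):
    """Yield (address, length, microseconds) for each submit/complete pair."""
    pending = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # not a usbmon event line
        urb, stamp, event, addr, _status, length = fields[:6]
        key = (urb, addr)
        if event == "S":
            # Remember when this URB was submitted.
            pending[key] = int(stamp)
        elif event == "C" and key in pending:
            # Completion: report the length actually transferred and the delay.
            yield addr, int(length), int(stamp) - pending.pop(key)

# Data-phase URBs of one 240-block and one 16-block read from the trace.
sample = [
    "ffff880407982f00 558636765 S Bi:2:003:1 -115 122880 <",
    "ffff880407982f00 558637699 C Bi:2:003:1 0 122880 = 00000000",
    "ffff88040ca76a80 558637797 S Bi:2:003:1 -115 8192 <",
    "ffff88040ca76a80 558637892 C Bi:2:003:1 0 8192 = 00000000",
]
for addr, length, usec in urb_latencies(sample):
    # Bytes per microsecond is numerically the same as MB/s (10^6 bytes/s).
    print(addr, length, usec, round(length / usec, 1))
```

Note that these intervals include whatever time the gadget spends before it starts sending data, so they say nothing about where the delay comes from, only how long each URB was outstanding on the host.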

Paul.