Hi, On 6/17/2021 2:55 PM, Ferry Toth wrote: > Hi > > Op 17-06-2021 om 23:48 schreef Wesley Cheng: >> Hi, >> >> On 6/17/2021 2:01 PM, Ferry Toth wrote: >>> Hi >>> >>> Op 17-06-2021 om 11:58 schreef Wesley Cheng: >>>> Changes in V10: >>>> - Fixed compilation errors in config where OF is not used (error due to >>>> unknown symbol for of_add_property()). Add of_add_property() stub. >>>> - Fixed compilation warning for incorrect argument being passed to >>>> dwc3_mdwidth >>> This fixes the OOPS I had in V9. I do not see any change in performance >>> on Merrifield though. >> I see...thanks Ferry! With your testing, are you writing to the device's >> internal storage (ie UFS, eMMC, etc...), or did you use a ramdisk as well? > > In this case I just tested the EEM path using iperf3. > Got it. I don't believe f_eem will use a high enough (if at all) bMaxBurst value to change the TXFIFO size. >> If not with a ramdisk, we might want to give that a try to avoid the >> storage path being the bottleneck. You can use "dd" to create an empty >> file, and then just use that as the LUN's backing file. >> >> echo ramdisk.img > >> /sys/kernel/config/usb_gadget/g1/functions/mass_storage.0/lun.0/file > > Ah, why didn't I think of that. I have currently mass storage setup with > eMMC but it seems that is indeed the bottleneck. > :), not a problem...I've been working on getting the ideal set up for the performance profiling for awhile, so anything I can do to make sure we get some good results. > I'll try with a ramdisk and let you know. > Thanks again for the testing, Ferry. Thanks Wesley Cheng >> Thanks >> Wesley Cheng >> >>>> Changes in V9: >>>> - Fixed incorrect patch in series. Removed changes in DTSI, as >>>> dwc3-qcom will >>>> add the property by default from the kernel. >>>> >>>> Changes in V8: >>>> - Rebased to usb-testing >>>> - Using devm_kzalloc for adding txfifo property in dwc3-qcom >>>> - Removed DWC3 QCOM ACPI property for enabling the txfifo resize >>>> >>>> Changes in V7: >>>> - Added a new property tx-fifo-max-num for limiting how much fifo >>>> space the >>>> resizing logic can allocate for endpoints with large burst >>>> values. This >>>> can differ across platforms, and tie in closely with overall >>>> system latency. >>>> - Added recommended checks for DWC32. >>>> - Added changes to set the tx-fifo-resize property from dwc3-qcom by >>>> default >>>> instead of modifying the current DTSI files. >>>> - Added comments on all APIs/variables introduced. >>>> - Updated the DWC3 YAML to include a better description of the >>>> tx-fifo-resize >>>> property and added an entry for tx-fifo-max-num. >>>> >>>> Changes in V6: >>>> - Rebased patches to usb-testing. >>>> - Renamed to PATCH series instead of RFC. >>>> - Checking for fs_descriptors instead of ss_descriptors for >>>> determining the >>>> endpoint count for a particular configuration. >>>> - Re-ordered patch series to fix patch dependencies. >>>> >>>> Changes in V5: >>>> - Added check_config() logic, which is used to communicate the >>>> number of EPs >>>> used in a particular configuration. Based on this, the DWC3 >>>> gadget driver >>>> has the ability to know the maximum number of eps utilized in all >>>> configs. >>>> This helps reduce unnecessary allocation to unused eps, and will >>>> catch fifo >>>> allocation issues at bind() time. >>>> - Fixed variable declaration to single line per variable, and >>>> reverse xmas. >>>> - Created a helper for fifo clearing, which is used by ep0.c >>>> >>>> Changes in V4: >>>> - Removed struct dwc3* as an argument for dwc3_gadget_resize_tx_fifos() >>>> - Removed WARN_ON(1) in case we run out of fifo space >>>> Changes in V3: >>>> - Removed "Reviewed-by" tags >>>> - Renamed series back to RFC >>>> - Modified logic to ensure that fifo_size is reset if we pass the >>>> minimum >>>> threshold. Tested with binding multiple FDs requesting 6 FIFOs. >>>> >>>> Changes in V2: >>>> - Modified TXFIFO resizing logic to ensure that each EP is reserved a >>>> FIFO. >>>> - Removed dev_dbg() prints and fixed typos from patches >>>> - Added some more description on the dt-bindings commit message >>>> >>>> Currently, there is no functionality to allow for resizing the >>>> TXFIFOs, and >>>> relying on the HW default setting for the TXFIFO depth. In most >>>> cases, the >>>> HW default is probably sufficient, but for USB compositions that contain >>>> multiple functions that require EP bursting, the default settings >>>> might not be enough. Also to note, the current SW will assign an EP to a >>>> function driver w/o checking to see if the TXFIFO size for that >>>> particular >>>> EP is large enough. (this is a problem if there are multiple HW defined >>>> values for the TXFIFO size) >>>> >>>> It is mentioned in the SNPS databook that a minimum of TX FIFO depth = 3 >>>> is required for an EP that supports bursting. Otherwise, there may be >>>> frequent occurences of bursts ending. For high bandwidth functions, >>>> such as data tethering (protocols that support data aggregation), mass >>>> storage, and media transfer protocol (over FFS), the bMaxBurst value >>>> can be >>>> large, and a bigger TXFIFO depth may prove to be beneficial in terms >>>> of USB >>>> throughput. (which can be associated to system access latency, >>>> etc...) It >>>> allows for a more consistent burst of traffic, w/o any interruptions, as >>>> data is readily available in the FIFO. >>>> >>>> With testing done using the mass storage function driver, the results >>>> show >>>> that with a larger TXFIFO depth, the bandwidth increased significantly. >>>> >>>> Test Parameters: >>>> - Platform: Qualcomm SM8150 >>>> - bMaxBurst = 6 >>>> - USB req size = 256kB >>>> - Num of USB reqs = 16 >>>> - USB Speed = Super-Speed >>>> - Function Driver: Mass Storage (w/ ramdisk) >>>> - Test Application: CrystalDiskMark >>>> >>>> Results: >>>> >>>> TXFIFO Depth = 3 max packets >>>> >>>> Test Case | Data Size | AVG tput (in MB/s) >>>> ------------------------------------------- >>>> Sequential|1 GB x | >>>> Read |9 loops | 193.60 >>>> | | 195.86 >>>> | | 184.77 >>>> | | 193.60 >>>> ------------------------------------------- >>>> >>>> TXFIFO Depth = 6 max packets >>>> >>>> Test Case | Data Size | AVG tput (in MB/s) >>>> ------------------------------------------- >>>> Sequential|1 GB x | >>>> Read |9 loops | 287.35 >>>> | | 304.94 >>>> | | 289.64 >>>> | | 293.61 >>>> ------------------------------------------- >>>> >>>> Wesley Cheng (6): >>>> usb: gadget: udc: core: Introduce check_config to verify USB >>>> configuration >>>> usb: gadget: configfs: Check USB configuration before adding >>>> usb: dwc3: Resize TX FIFOs to meet EP bursting requirements >>>> of: Add stub for of_add_property() >>>> usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default >>>> dt-bindings: usb: dwc3: Update dwc3 TX fifo properties >>>> >>>> .../devicetree/bindings/usb/snps,dwc3.yaml | 15 +- >>>> drivers/usb/dwc3/core.c | 9 + >>>> drivers/usb/dwc3/core.h | 15 ++ >>>> drivers/usb/dwc3/dwc3-qcom.c | 9 + >>>> drivers/usb/dwc3/ep0.c | 2 + >>>> drivers/usb/dwc3/gadget.c | 212 >>>> +++++++++++++++++++++ >>>> drivers/usb/gadget/configfs.c | 22 +++ >>>> drivers/usb/gadget/udc/core.c | 25 +++ >>>> include/linux/of.h | 5 + >>>> include/linux/usb/gadget.h | 5 + >>>> 10 files changed, 317 insertions(+), 2 deletions(-) >>>> -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project