Hi All,

Some SD/MMC controllers now support packed commands or packed requests,
which means multiple requests can be packaged and handed to the host
controller to be handled at one time, improving I/O performance. This
patch set adds an MMC packed request function to support packed requests
or packed commands.

In this patch set, I extended commit_rqs() to do batch processing, as
suggested by Ming, to allow dispatching a batch of requests to hardware,
and expanded the MMC software queue to support packed requests. I also
implemented the SD host ADMA3 transfer mode to support packed requests.
The ADMA3 transfer mode can process a multi-block data transfer by using
a pair of a command descriptor and an ADMA2 descriptor. In future we can
easily expand the MMC packed function to support packed commands.

Below is some comparison data between packed and non-packed requests,
measured with the fio tool. The fio command I used is shown below; I
changed the '--rw' parameter for each case and enabled the direct IO
flag to measure the actual hardware transfer speed. I ran each case 5
times and report the average speed.

./fio --filename=/dev/mmcblk0p30 --direct=1 --iodepth=20 --rw=read --bs=4K --size=512M --group_reporting --numjobs=20 --name=test_read

My eMMC card is working in HS400 Enhanced strobe mode:
[ 2.229856] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[ 2.237566] mmcblk0: mmc0:0001 HBG4a2 29.1 GiB
[ 2.242621] mmcblk0boot0: mmc0:0001 HBG4a2 partition 1 4.00 MiB
[ 2.249110] mmcblk0boot1: mmc0:0001 HBG4a2 partition 2 4.00 MiB
[ 2.255307] mmcblk0rpmb: mmc0:0001 HBG4a2 partition 3 4.00 MiB, chardev (248:0)

1. Non-packed request
1) Sequential read:
Speed: 59.2MiB/s, 60.4MiB/s, 63.6MiB/s, 60.3MiB/s, 59.9MiB/s
Average speed: 60.68MiB/s

2) Random read:
Speed: 31.3MiB/s, 31.4MiB/s, 31.5MiB/s, 31.3MiB/s, 31.3MiB/s
Average speed: 31.36MiB/s

3) Sequential write:
Speed: 71MiB/s, 71.8MiB/s, 72.3MiB/s, 72.2MiB/s, 71MiB/s
Average speed: 71.66MiB/s

4) Random write:
Speed: 68.9MiB/s, 68.7MiB/s, 68.8MiB/s, 68.6MiB/s, 68.8MiB/s
Average speed: 68.76MiB/s

2. Packed request
1) Sequential read:
Speed: 230MiB/s, 230MiB/s, 229MiB/s, 230MiB/s, 229MiB/s
Average speed: 229.6MiB/s

2) Random read:
Speed: 181MiB/s, 181MiB/s, 181MiB/s, 180MiB/s, 181MiB/s
Average speed: 180.8MiB/s

3) Sequential write:
Speed: 175MiB/s, 171MiB/s, 171MiB/s, 172MiB/s, 171MiB/s
Average speed: 172MiB/s

4) Random write:
Speed: 169MiB/s, 169MiB/s, 171MiB/s, 167MiB/s, 170MiB/s
Average speed: 169.2MiB/s

From the above data, we can see that packed requests improve throughput
greatly in every case: by roughly 2.4x for writes and by about 3.8x
(sequential) to 5.8x (random) for reads.

Any comments are welcome. Thanks a lot.

Changes from RFC v1:
- Re-implement the batch processing according to Ming's suggestion.
- Remove the bd.last validation in MMC block.c, since we always get
  bd.last == false according to the new batch processing method.
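
For anyone who wants to cross-check the fio numbers above, here is a
minimal Python sketch that recomputes the per-case averages and the
packed/non-packed speedup. It is not part of the patch set; the
dictionary and function names are made up for illustration, and the
samples are copied by hand from the results above.

```python
# Throughput samples in MiB/s, copied from the fio results above
# (5 runs per case). Names here are illustrative, not from the patches.
NON_PACKED = {
    "sequential read": [59.2, 60.4, 63.6, 60.3, 59.9],
    "random read": [31.3, 31.4, 31.5, 31.3, 31.3],
    "sequential write": [71.0, 71.8, 72.3, 72.2, 71.0],
    "random write": [68.9, 68.7, 68.8, 68.6, 68.8],
}

PACKED = {
    "sequential read": [230.0, 230.0, 229.0, 230.0, 229.0],
    "random read": [181.0, 181.0, 181.0, 180.0, 181.0],
    "sequential write": [175.0, 171.0, 171.0, 172.0, 171.0],
    "random write": [169.0, 169.0, 171.0, 167.0, 170.0],
}


def mean(samples):
    """Arithmetic mean of a non-empty list of samples."""
    return sum(samples) / len(samples)


def speedups(baseline, packed):
    """Ratio of packed to baseline average throughput for each case."""
    return {case: mean(packed[case]) / mean(baseline[case]) for case in baseline}


if __name__ == "__main__":
    for case, ratio in speedups(NON_PACKED, PACKED).items():
        before = mean(NON_PACKED[case])
        after = mean(PACKED[case])
        print(f"{case}: {before:.2f} -> {after:.2f} MiB/s ({ratio:.2f}x)")
```

The ratios are plain quotients of the two averages, so they only say
how much faster the packed path was on this one eMMC card and this fio
configuration.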

Baolin Wang (6):
  mmc: Add MMC packed request support for MMC software queue
  mmc: host: sdhci: Introduce ADMA3 transfer mode
  mmc: host: sdhci: Factor out the command configuration
  mmc: host: sdhci: Remove redundant sg_count member of struct sdhci_host
  mmc: host: sdhci: Add MMC packed request support
  mmc: host: sdhci-sprd: Add MMC packed request support

Ming Lei (1):
  block: Extand commit_rqs() to do batch processing

 block/blk-mq-sched.c          |  29 +-
 block/blk-mq.c                |  15 +-
 drivers/mmc/core/block.c      |  14 +
 drivers/mmc/core/core.c       |  26 ++
 drivers/mmc/core/core.h       |   2 +
 drivers/mmc/core/queue.c      |  19 +-
 drivers/mmc/host/mmc_hsq.c    | 292 +++++++++++++++++---
 drivers/mmc/host/mmc_hsq.h    |  25 +-
 drivers/mmc/host/sdhci-sprd.c |  30 +-
 drivers/mmc/host/sdhci.c      | 504 +++++++++++++++++++++++++++++-----
 drivers/mmc/host/sdhci.h      |  61 +++-
 include/linux/blk-mq.h        |   1 +
 include/linux/mmc/core.h      |   6 +
 include/linux/mmc/host.h      |   9 +
 14 files changed, 900 insertions(+), 133 deletions(-)

--
2.17.1