Hi

Here is V14 of the hardware command queue patches without the software
command queue patches, now using blk-mq and now with blk-mq support for
non-CQE I/O.

V14 includes a number of fixes to existing code, changes to default to
blk-mq, and adds patches to remove legacy code.

HW CMDQ offers 25% - 50% better random multi-threaded I/O.  I see a slight
2% drop in sequential read speed but no change to sequential write.

Non-CQE blk-mq showed a 3% decrease in sequential read performance.  This
seemed to be coming from the inferior latency of running work items compared
with a dedicated thread.  Hacking the blk-mq workqueue to be unbound reduced
the performance degradation from 3% to 1%.

While we should look at changing blk-mq to give better workqueue performance,
a bigger gain is likely to be made by adding a new host API to enable the
next already-prepared request to be issued directly from within the ->done()
callback of the current request.


Changes since V13:
      mmc: block: Fix missing blk_put_request()
        New patch.
      mmc: block: Check return value of blk_get_request()
        New patch.
      mmc: core: Do not leave the block driver in a suspended state
        New patch.
      mmc: block: Ensure that debugfs files are removed
        New patch.
      mmc: block: No need to export mmc_cleanup_queue()
        New patch.
      mmc: block: Simplify cleaning up the queue
        New patch.
      mmc: block: Use data timeout in card_busy_detect()
        New patch.
      mmc: block: Check for transfer state in card_busy_detect()
        New patch.
      mmc: block: Make card_busy_detect() accumulate all response error bits
        New patch.
      mmc: core: Make mmc_pre_req() and mmc_post_req() available
        New patch.
      mmc: core: Add parameter use_blk_mq
        Default to y
      mmc: block: Add blk-mq support
        Wrap blk_mq_end_request / blk_end_request_all
        Rename mmc_blk_rw_recovery -> mmc_blk_mq_rw_recovery
        Additional parentheses to '==' expressions
        Use mmc_pre_req() / mmc_post_req()
        Fix missing tuning release on error after mmc_start_request()
        Expand comment about timeouts
        Allow for possibility that the queue is quiesced when removing
        Ensure complete_work is flushed when removing
      mmc: block: Add CQE support
        Additional parentheses to '==' expressions
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
        Replaces patch "Stop using card_busy_detect()"
        retaining card_busy_detect()
      mmc: block: blk-mq: Stop using legacy recovery
        Allow for SPI
      mmc: mmc_test: Do not use mmc_start_areq() anymore
        New patch.
      mmc: core: Remove option not to use blk-mq
        New patch.
      mmc: block: Remove code no longer needed after the switch to blk-mq
        New patch.
      mmc: core: Remove code no longer needed after the switch to blk-mq
        New patch.

Changes since V12:
      mmc: block: Add error-handling comments
        New patch.
      mmc: block: Add blk-mq support
        Use legacy error handling
      mmc: block: Add CQE support
        Re-base
      mmc: block: blk-mq: Add support for direct completion
        New patch.
      mmc: block: blk-mq: Separate card polling from recovery
        New patch.
      mmc: block: blk-mq: Stop using card_busy_detect()
        New patch.
      mmc: block: blk-mq: Stop using legacy recovery
        New patch.

Changes since V11:
      Split "mmc: block: Add CQE and blk-mq support" into 2 patches

Changes since V10:
      mmc: core: Remove unnecessary host claim
      mmc: core: Introduce host claiming by context
      mmc: core: Add support for handling CQE requests
      mmc: mmc: Enable Command Queuing
      mmc: mmc: Enable CQE's
      mmc: block: Use local variables in mmc_blk_data_prep()
      mmc: block: Prepare CQE data
      mmc: block: Factor out mmc_setup_queue()
      mmc: core: Add parameter use_blk_mq
      mmc: core: Export mmc_start_bkops()
      mmc: core: Export mmc_start_request()
      mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
        Dropped because they have been applied
      mmc: block: Add CQE and blk-mq support
        Extend blk-mq support for asynchronous read / writes to all host
        controllers including those that require polling. The direct
        completion path is still available but depends on a new capability
        flag.
        Drop blk-mq support for synchronous read / writes.

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

Changes since V9:
      mmc: block: Add CQE and blk-mq support
        - reinstate mq support for REQ_OP_DRV_IN/OUT that was removed because
          it was incorrectly assumed to be handled by the rpmb character device
        - don't check for rpmb block device anymore
      mmc: cqhci: support for command queue enabled host
        Fix cqhci_set_irqs() as per Haibo Chen

Changes since V8:
      Re-based
      mmc: core: Introduce host claiming by context
        Slightly simplified as per Ulf
      mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
        New patch.
      mmc: block: Add CQE and blk-mq support
        Fix missing ->post_req() on the error path

Changes since V7:
      Re-based
      mmc: core: Introduce host claiming by context
        Slightly simplified
      mmc: core: Add parameter use_blk_mq
        New patch.
      mmc: core: Remove unnecessary host claim
        New patch.
      mmc: core: Export mmc_start_bkops()
        New patch.
      mmc: core: Export mmc_start_request()
        New patch.
      mmc: block: Add CQE and blk-mq support
        Add blk-mq support for non-CQE requests

Changes since V6:
      mmc: core: Introduce host claiming by context
        New patch.
      mmc: core: Move mmc_start_areq() declaration
        Dropped because it has been applied
      mmc: block: Fix block status codes
        Dropped because it has been applied
      mmc: host: Add CQE interface
        Dropped because it has been applied
      mmc: core: Turn off CQE before sending commands
        Dropped because it has been applied
      mmc: block: Factor out mmc_setup_queue()
        New patch.
      mmc: block: Add CQE support
        Drop legacy support and add blk-mq support

Changes since V5:
      Re-based
      mmc: core: Add mmc_retune_hold_now()
        Dropped because it has been applied
      mmc: core: Add members to mmc_request and mmc_data for CQE's
        Dropped because it has been applied
      mmc: core: Move mmc_start_areq() declaration
        New patch at Ulf's request
      mmc: block: Fix block status codes
        Another unrelated patch
      mmc: host: Add CQE interface
        Move recovery_notifier() callback to struct mmc_request
      mmc: core: Add support for handling CQE requests
        Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
        Move function declarations requested by Ulf
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
        Dropped because it has been applied
      mmc: block: Add CQE support
        Add explanation to commit message
        Adjustment for changed recovery_notifier() callback
      mmc: cqhci: support for command queue enabled host
        Adjustment for changed recovery_notifier() callback
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
        Add DCMD capability for Intel controllers except GLK

Changes since V4:
      mmc: core: Add mmc_retune_hold_now()
        Add explanation to commit message.
      mmc: host: Add CQE interface
        Add comments to callback declarations.
      mmc: core: Turn off CQE before sending commands
        Add explanation to commit message.
      mmc: core: Add support for handling CQE requests
        Add comments as requested by Ulf.
      mmc: core: Remove unused MMC_CAP2_PACKED_CMD
        New patch.
      mmc: mmc: Enable Command Queuing
        Adjust for removal of MMC_CAP2_PACKED_CMD.
        Add a comment about Packed Commands.
      mmc: mmc: Enable CQE's
        Remove unnecessary check for MMC_CAP2_CQE
      mmc: block: Use local variables in mmc_blk_data_prep()
        New patch.
      mmc: block: Prepare CQE data
        Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
        Remove priority setting.
        Add explanation to commit message.
      mmc: cqhci: support for command queue enabled host
        Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA

Changes since V3:
      Adjusted ...blk_end_request...() for new block status codes
      Fixed CQHCI transaction descriptor for "no DCMD" case

Changes since V2:
      Dropped patches that have been applied.
      Re-based
      Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"

Changes since V1:
      "Share mmc request array between partitions" is dependent
      on changes in "Introduce queue semantics", so added that
      and block fixes:
        Added "Fix is_waiting_last_req set incorrectly"
        Added "Fix cmd error reset failure path"
        Added "Use local var for mqrq_cur"
        Added "Introduce queue semantics"

Changes since RFC:
      Re-based on next.
      Added comment about command queue priority.
      Added some acks and reviews.

Adrian Hunter (9):
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add error-handling comments
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
      mmc: block: blk-mq: Add support for direct completion
      mmc: block: blk-mq: Separate card polling from recovery
      mmc: block: blk-mq: Stop using card_busy_detect()
      mmc: block: blk-mq: Stop using legacy recovery

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/Kconfig               |   11 +
 drivers/mmc/core/block.c          |  850 ++++++++++++++++++++++++++-
 drivers/mmc/core/block.h          |   12 +
 drivers/mmc/core/core.c           |    7 +
 drivers/mmc/core/core.h           |    2 +
 drivers/mmc/core/host.c           |    2 +
 drivers/mmc/core/host.h           |    4 +
 drivers/mmc/core/queue.c          |  426 +++++++++++++-
 drivers/mmc/core/queue.h          |   56 ++
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 +++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 ++++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
 include/linux/mmc/host.h          |    2 +
 15 files changed, 2900 insertions(+), 32 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

Adrian Hunter (4):
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/Kconfig               |   11 +
 drivers/mmc/core/block.c          |  801 +++++++++++++++++++++++++-
 drivers/mmc/core/block.h          |   12 +
 drivers/mmc/core/core.c           |    7 +
 drivers/mmc/core/core.h           |    2 +
 drivers/mmc/core/host.c           |    2 +
 drivers/mmc/core/host.h           |    4 +
 drivers/mmc/core/queue.c          |  426 +++++++++++++-
 drivers/mmc/core/queue.h          |   56 ++
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 +++++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 ++++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
 include/linux/mmc/host.h          |    2 +
 15 files changed, 2852 insertions(+), 31 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h

Adrian Hunter (23):
      mmc: block: Fix missing blk_put_request()
      mmc: block: Check return value of blk_get_request()
      mmc: core: Do not leave the block driver in a suspended state
      mmc: block: Ensure that debugfs files are removed
      mmc: block: No need to export mmc_cleanup_queue()
      mmc: block: Simplify cleaning up the queue
      mmc: block: Use data timeout in card_busy_detect()
      mmc: block: Check for transfer state in card_busy_detect()
      mmc: block: Make card_busy_detect() accumulate all response error bits
      mmc: core: Make mmc_pre_req() and mmc_post_req() available
      mmc: block: Add error-handling comments
      mmc: core: Add parameter use_blk_mq
      mmc: block: Add blk-mq support
      mmc: block: Add CQE support
      mmc: sdhci-pci: Add CQHCI support for Intel GLK
      mmc: block: blk-mq: Add support for direct completion
      mmc: block: blk-mq: Separate card polling from recovery
      mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
      mmc: block: blk-mq: Stop using legacy recovery
      mmc: mmc_test: Do not use mmc_start_areq() anymore
      mmc: core: Remove option not to use blk-mq
      mmc: block: Remove code no longer needed after the switch to blk-mq
      mmc: core: Remove code no longer needed after the switch to blk-mq

Venkat Gopalakrishnan (1):
      mmc: cqhci: support for command queue enabled host

 drivers/mmc/core/block.c          | 1396 +++++++++++++++++++++----------------
 drivers/mmc/core/block.h          |   12 +-
 drivers/mmc/core/bus.c            |    5 +-
 drivers/mmc/core/core.c           |  216 +-----
 drivers/mmc/core/core.h           |   39 +-
 drivers/mmc/core/debugfs.c        |    1 +
 drivers/mmc/core/host.h           |    4 +
 drivers/mmc/core/mmc_test.c       |  122 ++--
 drivers/mmc/core/queue.c          |  493 ++++++++-----
 drivers/mmc/core/queue.h          |   69 +-
 drivers/mmc/host/Kconfig          |   14 +
 drivers/mmc/host/Makefile         |    1 +
 drivers/mmc/host/cqhci.c          | 1150 ++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h          |  240 +++++++
 drivers/mmc/host/sdhci-pci-core.c |  155 +++-
 include/linux/mmc/host.h          |    5 +-
 16 files changed, 2840 insertions(+), 1082 deletions(-)
 create mode 100644 drivers/mmc/host/cqhci.c
 create mode 100644 drivers/mmc/host/cqhci.h


Regards
Adrian