From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>

v4:
 - Rebase on the updated block/master branch, which includes a flush
   bugfix from Christoph. Please help to check patch 04. Thanks!
 - Add a bugfix patch 02 for post-flush requests, placed before the
   other flush optimizations.
 - Collect Reviewed-by tags from Ming and Christoph. Thanks!
 - [v3] https://lore.kernel.org/lkml/20230707093722.1338589-1-chengming.zhou@xxxxxxxxx/

v3:
 - Collect Reviewed-by tags from Ming and Christoph. Thanks!
 - Remove the list and csd variables, which are only used once.
 - Fix a bug reported by blktests nvme/012 by re-initializing
   rq->queuelist, which may be corrupted by the rq->rq_next reuse.
 - [v2] https://lore.kernel.org/all/20230629110359.1111832-1-chengming.zhou@xxxxxxxxx/

v2:
 - Change to use call_single_data_t, which uses __aligned() to avoid
   using 2 cache lines for 1 csd. Thanks Ming Lei.
 - [v1] https://lore.kernel.org/all/20230627120854.971475-1-chengming.zhou@xxxxxxxxx/

Hello,

After commit be4c427809b0 ("blk-mq: use the I/O scheduler for writes
from the flush state machine"), rq->flush can't reuse rq->elv anymore,
since flush_data requests can now go into the I/O scheduler.

That increased the size of struct request by 24 bytes, but this
patchset decreases the size by 40 bytes, which is good I think.

Patch 1 uses a percpu csd for remote completion instead of a per-rq
csd, decreasing the size by 24 bytes.

Patch 2 fixes a bug in blk-flush for post-flush requests.

Patches 3-4 reuse rq->queuelist for the flush state machine pending
list and maintain an unsigned long counter of inflight flush_data
requests, decreasing the size by 16 bytes.

Thanks for comments!

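For reviewers who want to sanity-check the size arithmetic above, here
is a minimal userspace sketch. It is not kernel code: the *_model
types and the helper functions are hypothetical stand-ins, and the
field layouts are my assumption of how the per-rq csd and the flush
fields look on a 64-bit build. The only figures taken from this
series are the 24-byte, 16-byte and 40-byte savings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: LP64 target (8-byte pointers), as on x86_64. */
static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit build");

/* Hypothetical stand-in for struct list_head: two pointers. */
struct list_head_model {
	void *next;
	void *prev;
};

/* Hypothetical stand-in for a per-request csd (assumed layout). */
struct csd_model {
	void *llist_next;
	uint32_t flags;
	uint16_t src;
	uint16_t dst;
	void (*func)(void *info);
	void *info;
};

/* Assumed "before" layout: the per-rq csd shares a union with fifo_time. */
union csd_before_model {
	struct csd_model csd;
	uint64_t fifo_time;
};

/* Assumed flush fields before the series: own pending-list node. */
struct flush_before_model {
	unsigned int seq;
	struct list_head_model list;
	void (*saved_end_io)(void *rq);
};

/* Assumed flush fields after the series: list node dropped, because
 * the pending list reuses the request's existing queuelist. */
struct flush_after_model {
	unsigned int seq;
	void (*saved_end_io)(void *rq);
};

/* Bytes saved by keeping only fifo_time in the request (patch 1). */
size_t csd_savings(void)
{
	return sizeof(union csd_before_model) - sizeof(uint64_t);
}

/* Bytes saved by dropping the flush list node (patches 3-4). */
size_t flush_savings(void)
{
	return sizeof(struct flush_before_model) -
	       sizeof(struct flush_after_model);
}

/* Total reduction modelled by this sketch. */
size_t total_savings(void)
{
	return csd_savings() + flush_savings();
}
```

Under these assumptions csd_savings() and flush_savings() add up to
the 40 bytes quoted above. The real numbers depend on the kernel
config and architecture, so pahole on an actual build remains the
authoritative check.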
Chengming Zhou (4):
  blk-mq: use percpu csd to remote complete instead of per-rq csd
  blk-flush: fix rq->flush.seq for post-flush requests
  blk-flush: count inflight flush_data requests
  blk-flush: reuse rq queuelist in flush state machine

 block/blk-flush.c      | 26 +++++++++++++++-----------
 block/blk-mq.c         | 12 ++++++------
 block/blk.h            |  5 ++---
 include/linux/blk-mq.h |  6 +-----
 4 files changed, 24 insertions(+), 25 deletions(-)

-- 
2.41.0