On Mon, May 03, 2021 at 06:22:13PM +0800, John Garry wrote:
> The tags used for an IO scheduler are currently per hctx.
>
> As such, when q->nr_hw_queues grows, so does the request queue total IO
> scheduler tag depth.
>
> This may cause problems for SCSI MQ HBAs whose total driver depth is
> fixed.
>
> Ming and Yanhui report higher CPU usage and lower throughput in scenarios
> where the fixed total driver tag depth is appreciably lower than the total
> scheduler tag depth:
> https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@xxxxxxxxxx/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b
>

With this patch, there is no longer any difference in fio results on
scsi_debug on Yanhui's test machine between:

	modprobe scsi_debug host_max_queue=128 submit_queues=32 virtual_gb=256 delay=1

vs.

	modprobe scsi_debug max_queue=128 submit_queues=1 virtual_gb=256 delay=1

Without this patch, the latter's result is 30% higher than the former's.

Note: scsi_debug's queue depth needs to be updated to 128 to avoid an IO
hang, which is a separate SCSI issue.

Thanks,
Ming
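PS: for anyone reproducing the test, one way the queue depth update mentioned
above could be done is through sysfs; this is a sketch, and the device name
sdb is an assumption, so substitute the disk that scsi_debug actually creates
on your system:

```shell
# Assumption: the scsi_debug disk appears as /dev/sdb; check lsscsi or
# dmesg to find the real name on your machine.
# Raise the SCSI device queue depth to 128 so it matches max_queue,
# avoiding the IO hang mentioned above.
echo 128 > /sys/block/sdb/device/queue_depth

# Verify the new depth took effect.
cat /sys/block/sdb/device/queue_depth
```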