On Tue, Dec 14, 2021 at 12:31:23AM +0000, Dexuan Cui wrote:
> > From: Ming Lei <ming.lei@xxxxxxxxxx>
> > Sent: Sunday, December 12, 2021 11:38 PM
>
> Ming, thanks so much for the detailed analysis!
>
> > From the log:
> >
> > 1) dm-mpath:
> > - queue depth: 2048
> > - busy: 848, and 62 of them are in sw queue, so run queue is often
> > caused
> > - nr_hw_queues: 1
> > - dm-2 is in use, and dm-1/dm-3 is idle
> > - dm-2's dispatch busy is 8, that should be the reason why excessive CPU
> > usage is observed when flushing plug list without commit dc5fc361d891 in
> > which hctx->dispatch_busy is just bypassed
> >
> > 2) iscsi
> > - dispatch_busy is 0
> > - nr_hw_queues: 1
> > - queue depth: 113
> > - busy=~33, active_queues is 3, so each LUN/iscsi host is saturated
> > - 23 active LUNs, 23 * 33 = 759 in-flight commands
> >
> > The high CPU utilization may be caused by:
> >
> > 1) big queue depth of dm mpath, the situation may be improved much if it
> > is reduced to 1024 or 800. The max allowed inflight commands from iscsi
> > hosts can be figured out, if dm's queue depth is much more than this number,
> > the extra commands need to dispatch, and run queue can be scheduled
> > immediately, so high CPU utilization is caused.
>
> I think you're correct:
> with dm_mod.dm_mq_queue_depth=256, the max CPU utilization is 8%.
> with dm_mod.dm_mq_queue_depth=400, the max CPU utilization is 12%.
> with dm_mod.dm_mq_queue_depth=800, the max CPU utilization is 88%.
>
> The performance with queue_depth=800 is poor.
> The performance with queue_depth=400 is good.
> The performance with queue_depth=256 is also good, and there is only a
> small drop compared with the 400 case.

That should be the reason why the issue isn't triggered in the case of a
real io scheduler.

So far, blk-mq doesn't provide a way to adjust the tags queue depth
dynamically.

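As a rough sketch of the arithmetic in the log analysis quoted above: the
constants come from that log, while the helper name excess_requests and the
choice to treat the ~759 in-flight commands as the capacity of the paths are
assumptions for illustration only, not something measured.

```python
# Figures from the log analysis quoted above.
ACTIVE_LUNS = 23        # active LUNs
BUSY_PER_LUN = 33       # observed busy=~33 per LUN
ISCSI_HOST_DEPTH = 113  # queue depth of one iscsi host
LUNS_PER_HOST = 3       # active_queues is 3


def excess_requests(dm_depth, path_capacity):
    """Hypothetical helper: requests dm-mpath may hold beyond what the
    underlying paths can keep in flight (assumption: capacity is fixed)."""
    return max(0, dm_depth - path_capacity)


# Total commands the iscsi side keeps in flight: 23 * 33.
inflight = ACTIVE_LUNS * BUSY_PER_LUN

# Rough per-LUN share of one iscsi host's queue depth: 113 / 3, rounded down.
per_lun_share = ISCSI_HOST_DEPTH // LUNS_PER_HOST

if __name__ == "__main__":
    print("in-flight commands:", inflight)
    print("per-LUN share of host depth:", per_lun_share)
    for depth in (256, 400, 800, 2048):
        print("dm depth", depth, "excess", excess_requests(depth, inflight))
```

Under that assumption, depths of 256 and 400 leave no excess, while 800
leaves 41 and 2048 leaves 1289 requests that dm-mpath can hold but the iscsi
hosts cannot accept. That is at least consistent with the CPU utilization
numbers reported above, though it is only a back-of-the-envelope model.
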
But I don't understand the reason for the default dm_mq_queue_depth (2048).
In this situation, each LUN can queue at most 113/3 requests, since 3 LUNs
are attached to a single iscsi host.

Mike, can you share why the default dm_mq_queue_depth is so big? It also
seems that it doesn't consider the underlying queue's queue depth. What is
the biggest dm rq queue depth needed to saturate all underlying paths?

> > 2) single hw queue, so contention should be big, which should be avoided
> > in big machine, nvme-tcp might be better than iscsi here
> >
> > 3) iscsi io latency is a bit big
> >
> > Even CPU utilization is reduced by commit dc5fc361d891, io performance
> > can't be good too with v5.16-rc, I guess.
> >
> > Thanks,
> > Ming
>
> Actually the I/O performance of v5.16-rc4 (commit dc5fc361d891 is included)
> is good -- it's about the same as the case where v5.16-rc4 + reverting
> dc5fc361d891 + dm_mod.dm_mq_queue_depth=400 (or 256).

The single hw queue may be the root cause of your issue. There is only a
single run_work, which can be touched by almost all CPUs (~200), so cache
ping-pong could be very serious.

Jens' patch may improve it more or less, so please test it.

Thanks,
Ming

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel