Re: [RFC PATCH] blk-mq: implement queue quiesce via percpu_ref for BLK_MQ_F_BLOCKING

Ming Lei <ming.lei@xxxxxxxxxx> · Fri, 31 Jul 2020 08:28:38 +0800

On Thu, Jul 30, 2020 at 09:10:48AM -0700, Sagi Grimberg wrote:
> 
> > > > > > > In case of BLK_MQ_F_BLOCKING, blk-mq uses SRCU to mark read critical
> > > > > > > section during dispatching request, then request queue quiesce is based on
> > > > > > > SRCU. What we want to get is low cost added in fast path.
> > > > > > > 
> > > > > > > However, from srcu_read_lock/srcu_read_unlock implementation, not see
> > > > > > > it is quicker than percpu refcount, so use percpu_ref to implement
> > > > > > > queue quiesce. This usage is cleaner and simpler & enough for implementing
> > > > > > > queue quiesce. The main requirement is to make sure all read sections to observe
> > > > > > > QUEUE_FLAG_QUIESCED once blk_mq_quiesce_queue() returns.
> > > > > > > 
> > > > > > > Also it becomes much easier to add interface of async queue quiesce.
> > > > > > 
> > > > > > BTW, no obvious IOPS difference is observed with this patch applied when running
> > > > > > io on null_blk(blocking, submit_queues=32) in one dual-socket, 32cores system.
> > > > > 
> > > > > Thanks Ming, can you test for non-blocking on the same setup?
> > > > 
> > > > OK, I can do that, but this patch supposes to not affect non-blocking,
> > > > care to share your motivation for testing non-blocking?
> > > 
> > > I think it will be a significant improvement to have a single code path.
> > > The code will be more robust and we won't need to face issues that are
> > > specific for blocking.
> > > 
> > > If the cost is negligible, I think the upside is worth it.
> > > 
> > 
> > rcu_read_lock and rcu_read_unlock has been proved as efficient enough,
> > and I don't think percpu_refcount is better than it, so I'd suggest to
> > not switch non-blocking into this way.
> 
> It's not a matter of which is better, its a matter of making the code
> more robust because it has a single code-path. If moving to percpu_ref
> is negligible, I would suggest to move both, I don't want to have two
> completely different mechanism for blocking vs. non-blocking.

RCU and SRCU have been different mechanism already.

> 
> > BTW, in case of blocking, one hctx may dispatch at most one request because there
> > is only single .run_work, even though when .queue_rq() is slept, that said
> > blk_mq_submit_bio() queues bio in sync style. This way won't be very efficient.
> > So percpu_refcount should be good enough for blocking code path, but may not be
> > well enough for non-blocking case.
> 
> Not sure what you mean, the percpu_ref is taken exactly where rcu is
> taken, not sure what is the difference.

My point is that blocking can't be efficient enough, when .queue_rq() is
slept, no any request can be queued to this hctx any more.

thanks,
Ming