Re: [PATCH v5 1/2] blk-mq: add tagset quiesce interface

Chao Leng <lengchao@xxxxxxxxxx> · Fri, 7 Aug 2020 17:35:54 +0800

On 2020/8/7 17:24, Ming Lei wrote:
On Fri, Aug 07, 2020 at 05:04:38PM +0800, Chao Leng wrote:

On 2020/7/29 12:39, Sagi Grimberg wrote:

Dynamically allocating each one is possible but not very scalable.

The question is whether there is some way to do this with an on-stack
or a single on-heap rcu_head, or an equivalent, that achieves the same
effect.

If the hctx structures are guaranteed to stay put, you could count
them and then do a single allocation of an array of rcu_head structures
(or some larger structure containing an rcu_head structure, if needed).
You could then sequence through this array, consuming one rcu_head per
hctx as you processed it.  Once all the callbacks had been invoked,
it would be safe to free the array.
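
The single-allocation idea above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: struct cb_head stands in for struct rcu_head, and quiesce_slot, quiesce_batch, quiesce_batch_init() and quiesce_batch_run_callbacks() are hypothetical names. The "grace period" is simulated by invoking the callbacks directly; real code would presumably hand each head to call_srcu() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head (assumption). */
struct cb_head {
	void (*func)(struct cb_head *head);
};

struct quiesce_batch;

/* One slot per hctx: a callback head plus a back-pointer to the batch. */
struct quiesce_slot {
	struct cb_head head;
	struct quiesce_batch *batch;
};

/* Caller-owned bookkeeping for one quiesce pass (hypothetical). */
struct quiesce_batch {
	struct quiesce_slot *slots;	/* single array, one entry per hctx */
	size_t nr_slots;
	size_t pending;			/* callbacks not yet invoked */
};

#define slot_of(ptr) \
	((struct quiesce_slot *)((char *)(ptr) - offsetof(struct quiesce_slot, head)))

/* Runs once per hctx; the last invocation frees the whole array. */
static void quiesce_cb(struct cb_head *head)
{
	struct quiesce_batch *batch = slot_of(head)->batch;

	assert(batch->pending > 0);
	if (--batch->pending == 0) {
		free(batch->slots);
		batch->slots = NULL;
	}
}

/* Count the hctxs first, then do one allocation covering all of them. */
static int quiesce_batch_init(struct quiesce_batch *batch, size_t nr_hctx)
{
	size_t i;

	if (nr_hctx == 0)
		return -1;
	batch->slots = calloc(nr_hctx, sizeof(*batch->slots));
	if (!batch->slots)
		return -1;
	batch->nr_slots = nr_hctx;
	batch->pending = nr_hctx;
	for (i = 0; i < nr_hctx; i++) {
		batch->slots[i].head.func = quiesce_cb;
		batch->slots[i].batch = batch;
	}
	return 0;
}

/* Simulated grace-period end: invoke every queued callback once. */
static void quiesce_batch_run_callbacks(struct quiesce_batch *batch)
{
	size_t i;

	for (i = 0; i < batch->nr_slots; i++) {
		struct cb_head *head = &batch->slots[i].head;

		head->func(head);
	}
}
```

A caller would initialize a batch with the hctx count, queue one head per hctx, and treat the array as gone once pending reaches zero. The array still grows linearly with the number of hctxs, so the sketch does not remove the cost of one large allocation.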

Sounds too simple, though.  So what am I missing?

We don't want higher-order allocations...

So:

    (1) We don't want to embed the struct in the hctx because we allocate
    so many of them that this is a non-negligible cost to add for something
    we typically never use.

    (2) We don't want to allocate dynamically because it's potentially
    huge.

As long as we're using srcu for blocking hctx's, I think it's "pick your
poison".

Alternatively, Ming's percpu_ref patch(*) may be worth a look.

   (*) https://www.spinics.net/lists/linux-block/msg56976.html1

I'm not opposed to having this. It will require some more testing,
as this affects pretty much every driver out there.

If we are going with a lightweight percpu_ref, can we also use it
for non-blocking hctx and have a single code path?
I tried to optimize the patch to support both non-blocking and
blocking queues.
See the next email.

Please see the following thread:

https://lore.kernel.org/linux-block/05f75e89-b6f7-de49-eb9f-a08aa4e0ba4f@xxxxxxxxx/

Neither Keith nor Jens thought it was a good idea.
If we can support both non-blocking and blocking queues simply, this may be a good choice.
Please review the patch first.

Thanks,
Ming
