Re: [PATCH 6/6] blk-mq: stop to allocate new request and drain request before hctx becomes inactive

On Thu, May 14, 2020 at 08:55:34PM -0700, Bart Van Assche wrote:
> On 2020-05-14 18:41, Ming Lei wrote:
> > +	/* Prevent new request from being allocated on the current hctx/cpu */
> > +	set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
> 
> What guarantees that this change has been observed by all CPUs before
> the blk_mq_tags_has_request() loop finishes?

We don't need all CPUs to observe the INACTIVE flag; only allocations from
the current CPU need to observe it.

The current CPU is the last online CPU of this hctx, so:

1) any request allocated from this CPU will either be drained by the
blk_mq_tags_has_request() loop, or end up being allocated remotely instead,
see blk_mq_get_request().

2) for any request allocated from another, already-offline CPU of this
hctx, its tag bit was committed to memory before that CPU went offline, so
the current CPU (running blk_mq_hctx_notify_offline) can observe the tag
bit reliably; each CPU's offline handling is also strictly serialized.
A minimal sketch of this ordering follows below.
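
Just to illustrate, here is a minimal userspace sketch of that ordering
with C11 atomics. This is not the kernel code; the names tag_bitmap,
hctx_inactive, alloc_tag() and offline_drain() are made up for
illustration only:

/*
 * Allocation side (last online CPU of the hctx):
 *     set tag bit   ->  check INACTIVE
 * Offline side (blk_mq_hctx_notify_offline):
 *     set INACTIVE  ->  scan tag bits
 *
 * With sequentially consistent ordering on both sides, at least one
 * side must observe the other side's store, so a tag can never be both
 * allocated on this CPU and missed by the drain scan.
 */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_ulong tag_bitmap;		/* one bit per in-flight tag */
static atomic_bool  hctx_inactive;	/* stands in for BLK_MQ_S_INACTIVE */

/* allocation path: grab a tag first, then re-check the INACTIVE flag */
static bool alloc_tag(unsigned int tag)
{
	atomic_fetch_or(&tag_bitmap, 1UL << tag);
	if (atomic_load(&hctx_inactive)) {
		/* too late: release the tag and retry on another CPU */
		atomic_fetch_and(&tag_bitmap, ~(1UL << tag));
		return false;
	}
	return true;
}

/* offline path: mark the hctx inactive first, then wait for tags to drain */
static void offline_drain(void)
{
	atomic_store(&hctx_inactive, true);
	while (atomic_load(&tag_bitmap))
		;	/* busy-wait stands in for the blk_mq_tags_has_request() loop */
}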

> 
> > +	/*
> > +	 * Grab one refcount for avoiding scheduler switch, and
> > +	 * return immediately if queue has been frozen.
> > +	 */
> > +	if (!percpu_ref_tryget(&q->q_usage_counter))
> > +		return 0;
> 
> If percpu_ref_tryget(&q->q_usage_counter) fails that means either that
> request queue freezing is in progress or that a request queue has been
> frozen. I don't think that it is safe to return immediately if request
> queue freezing is still in progress.

static inline bool percpu_ref_tryget(struct percpu_ref *ref)
{
        unsigned long __percpu *percpu_count;
        bool ret;

        rcu_read_lock();

        if (__ref_is_percpu(ref, &percpu_count)) {
                this_cpu_inc(*percpu_count);
                ret = true;
        } else {
                ret = atomic_long_inc_not_zero(&ref->count);
        }

        rcu_read_unlock();

        return ret;
}

If percpu_ref_tryget() returns false, that means the atomic refcount has
already dropped to zero, so the request queue has been fully frozen, and we
are safe to return. Or is my understanding of atomic_long_inc_not_zero()
wrong? :-)
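
For reference, a minimal userspace sketch of the inc-not-zero semantics
with C11 atomics (not the kernel's implementation; the helper name
long_inc_not_zero() is made up):

#include <stdatomic.h>
#include <stdbool.h>

/*
 * The increment is only applied while the count is still non-zero, so a
 * false return really does mean the refcount has already dropped to zero,
 * i.e. the queue is fully frozen rather than merely freezing.
 */
static bool long_inc_not_zero(atomic_long *count)
{
	long old = atomic_load(count);

	do {
		if (old == 0)
			return false;	/* already zero, don't resurrect */
	} while (!atomic_compare_exchange_weak(count, &old, old + 1));

	return true;			/* got a reference */
}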


Thanks,
Ming



