On 1/19/24 4:24 PM, Bart Van Assche wrote:
> On 1/19/24 08:02, Jens Axboe wrote:
>> +	/*
>> +	 * If someone else is already dispatching, skip this one. This will
>> +	 * defer the next dispatch event to when something completes, and could
>> +	 * potentially lower the queue depth for contended cases.
>> +	 *
>> +	 * See the logic in blk_mq_do_dispatch_sched(), which loops and
>> +	 * retries if nothing is dispatched.
>> +	 */
>> +	if (test_bit(DD_DISPATCHING, &dd->run_state) ||
>> +	    test_and_set_bit(DD_DISPATCHING, &dd->run_state))
>> +		return NULL;
>> +
>>  	spin_lock(&dd->lock);
>>  	rq = dd_dispatch_prio_aged_requests(dd, now);
>>  	if (rq)
>> @@ -616,6 +635,7 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>>  	}
>>  unlock:
>> +	clear_bit(DD_DISPATCHING, &dd->run_state);
>>  	spin_unlock(&dd->lock);
>
> From Documentation/memory-barriers.txt: "These are also used for atomic RMW
> bitop functions that do not imply a memory barrier (such as set_bit and
> clear_bit)." Does this mean that CPUs with a weak memory model (e.g. ARM)
> are allowed to execute the clear_bit() call earlier than where it occurs in
> the code? I think that spin_trylock() has "acquire" semantics and also that
> "spin_unlock()" has release semantics. While a CPU is allowed to execute
> clear_bit() before the memory operations that come before it, I don't think
> that is the case for spin_unlock(). See also
> tools/memory-model/Documentation/locking.txt.

Not sure why I didn't do it upfront, but they just need to be the _lock
variants of the bitops. I'll make that change.

-- 
Jens Axboe
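
For illustration, a rough, untested sketch of what the _lock-variant change
could look like, assuming the same surrounding context as the diff quoted
above (this is not the actual follow-up patch):

	/*
	 * test_and_set_bit_lock() has acquire semantics; the plain
	 * test_bit() in front only avoids the atomic RMW when another
	 * context is already dispatching.
	 */
	if (test_bit(DD_DISPATCHING, &dd->run_state) ||
	    test_and_set_bit_lock(DD_DISPATCHING, &dd->run_state))
		return NULL;

	spin_lock(&dd->lock);
	...
unlock:
	/*
	 * clear_bit_unlock() has release semantics, so it cannot be
	 * reordered before the dispatch work done above.
	 */
	clear_bit_unlock(DD_DISPATCHING, &dd->run_state);
	spin_unlock(&dd->lock);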