On 1/19/24 08:02, Jens Axboe wrote:
> +	/*
> +	 * If someone else is already dispatching, skip this one. This will
> +	 * defer the next dispatch event to when something completes, and could
> +	 * potentially lower the queue depth for contended cases.
> +	 *
> +	 * See the logic in blk_mq_do_dispatch_sched(), which loops and
> +	 * retries if nothing is dispatched.
> +	 */
> +	if (test_bit(DD_DISPATCHING, &dd->run_state) ||
> +	    test_and_set_bit(DD_DISPATCHING, &dd->run_state))
> +		return NULL;
> +
>  	spin_lock(&dd->lock);
>  	rq = dd_dispatch_prio_aged_requests(dd, now);
>  	if (rq)
> @@ -616,6 +635,7 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>  	}
>  unlock:
> +	clear_bit(DD_DISPATCHING, &dd->run_state);
>  	spin_unlock(&dd->lock);
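
My reading of the new fast path, with ordering annotations based on
Documentation/atomic_bitops.txt (the annotations are mine):

	if (test_bit(DD_DISPATCHING, &dd->run_state) ||       /* plain read, unordered */
	    test_and_set_bit(DD_DISPATCHING, &dd->run_state)) /* RMW with return value, fully ordered */
		return NULL;

	spin_lock(&dd->lock);                                 /* acquire */

If that reading is correct, my question is only about the clear_bit() call
on the unlock path.
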
From Documentation/memory-barriers.txt, about smp_mb__before_atomic() and
smp_mb__after_atomic(): "These are also used for atomic RMW bitop functions
that do not imply a memory barrier (such as set_bit and clear_bit)." Does
this mean that CPUs with a weak memory model (e.g. ARM) are allowed to
execute the clear_bit() call earlier than where it occurs in the code? I
think that spin_lock() and a successful spin_trylock() have "acquire"
semantics and that spin_unlock() has "release" semantics. So while a CPU is
allowed to execute the clear_bit() call before memory operations that
precede it in program order, I don't think the same holds for spin_unlock():
its release semantics guarantee that all earlier memory operations are
visible before the unlock itself. See also
tools/memory-model/Documentation/locking.txt.
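
If the clearing of DD_DISPATCHING had to be ordered after the updates done
while dispatching on its own, i.e. without relying on the release semantics
of the subsequent spin_unlock(), one way to make that explicit would be the
following sketch (illustration only, not a request to change the patch):

unlock:
	/*
	 * clear_bit_unlock() has release semantics, so everything written
	 * while dispatching is ordered before the bit appears cleared to a
	 * concurrent test_bit()/test_and_set_bit().
	 */
	clear_bit_unlock(DD_DISPATCHING, &dd->run_state);
	spin_unlock(&dd->lock);

An smp_mb__before_atomic() in front of a plain clear_bit() would provide at
least the same ordering; that is the use case the memory-barriers.txt text
quoted above is about.
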
Thanks,
Bart.