On 5/21/23 7:23 PM, Ming Lei wrote:
> On Mon, May 22, 2023 at 10:15:22AM +0900, Damien Le Moal wrote:
>> On 5/22/23 09:43, Tian Lan wrote:
>>> From: Tian Lan <tian.lan@xxxxxxxxxxxx>
>>>
>>> If multiple CPUs are sharing the same hardware queue, it can
>>> cause leak in the active queue counter tracking when __blk_mq_tag_busy()
>>> is executed simultaneously.
>>>
>>> Fixes: ee78ec1077d3 ("blk-mq: blk_mq_tag_busy is no need to return a value")
>>> Signed-off-by: Tian Lan <tian.lan@xxxxxxxxxxxx>
>>> ---
>>>  block/blk-mq-tag.c | 10 ++++++----
>>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>>> index d6af9d431dc6..07372032238a 100644
>>> --- a/block/blk-mq-tag.c
>>> +++ b/block/blk-mq-tag.c
>>> @@ -42,13 +42,15 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
>>>  	if (blk_mq_is_shared_tags(hctx->flags)) {
>>>  		struct request_queue *q = hctx->queue;
>>>
>>> -		if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
>>> +		if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) ||
>>> +		    test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags)) {
>>
>> This is weird. test_and_set_bit() returns the bit old value, so shouldn't this be:
>>
>> 	if (test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
>> 		return;
>>
>> ?
>
> It is one micro optimization since test_and_set_bit is much heavier
> than test_bit, so test_and_set_bit() is just needed in the 1st time.

It's an optimization, but it's certainly not a micro one. If the common
case is always hitting that, then test_and_set_bit() will repeatedly
dirty that cacheline. And obviously it's useless to do that if the bit
is already set. This makes it a pretty nice optimization and definitely
outside the realm of "micro optimization" as it can have quite a large
impact.

I used that in various spots in blk-mq, which I suspect is where the
inspiration for this one came from too.

-- 
Jens Axboe
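
For anyone following along, here is a rough userspace sketch of the
test-then-test_and_set pattern discussed above. It is not the kernel
code: it uses C11 atomics in place of the kernel bitops, and the names
flag_test(), flag_test_and_set(), mark_active_once() and
FLAG_ACTIVE_BIT are made up for illustration. The idea it tries to show
is that the cheap read-only check short-circuits the atomic
read-modify-write once the bit is set, so the common case should not
need to write to the cacheline at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for QUEUE_FLAG_HCTX_ACTIVE (assumption). */
#define FLAG_ACTIVE_BIT 0UL

/* Read-only check, roughly analogous to test_bit(): a plain load. */
static bool flag_test(atomic_ulong *flags, unsigned long bit)
{
	return (atomic_load_explicit(flags, memory_order_relaxed) >> bit) & 1UL;
}

/*
 * Atomic read-modify-write, roughly analogous to test_and_set_bit():
 * sets the bit and returns its OLD value. This typically needs the
 * cacheline in exclusive state even when the bit was already set.
 */
static bool flag_test_and_set(atomic_ulong *flags, unsigned long bit)
{
	unsigned long mask = 1UL << bit;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/*
 * Returns true only for the single caller that flipped the bit 0 -> 1.
 * Everyone else returns false, and once the bit is set they do so via
 * the read-only fast path without issuing the atomic RMW.
 */
static bool mark_active_once(atomic_ulong *flags)
{
	if (flag_test(flags, FLAG_ACTIVE_BIT) ||
	    flag_test_and_set(flags, FLAG_ACTIVE_BIT))
		return false;	/* already active, nothing to account */

	return true;		/* we won the race, account exactly once */
}
```

In this sketch, only the caller for which mark_active_once() returns
true would go on to bump the active-queue counter, so concurrent callers
cannot count the same queue twice.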