On Mon, May 22, 2023 at 10:29:05AM +0900, Damien Le Moal wrote: > On 5/22/23 10:23, Ming Lei wrote: > > On Mon, May 22, 2023 at 10:15:22AM +0900, Damien Le Moal wrote: > >> On 5/22/23 09:43, Tian Lan wrote: > >>> From: Tian Lan <tian.lan@xxxxxxxxxxxx> > >>> > >>> If multiple CPUs are sharing the same hardware queue, it can > >>> cause leak in the active queue counter tracking when __blk_mq_tag_busy() > >>> is executed simultaneously. > >>> > >>> Fixes: ee78ec1077d3 ("blk-mq: blk_mq_tag_busy is no need to return a value") > >>> Signed-off-by: Tian Lan <tian.lan@xxxxxxxxxxxx> > >>> --- > >>> block/blk-mq-tag.c | 10 ++++++---- > >>> 1 file changed, 6 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c > >>> index d6af9d431dc6..07372032238a 100644 > >>> --- a/block/blk-mq-tag.c > >>> +++ b/block/blk-mq-tag.c > >>> @@ -42,13 +42,15 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx) > >>> if (blk_mq_is_shared_tags(hctx->flags)) { > >>> struct request_queue *q = hctx->queue; > >>> > >>> - if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags)) > >>> + if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) || > >>> + test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags)) { > >> > >> This is weird. test_and_set_bit() returns the bit old value, so shouldn't this be: > >> > >> if (test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags)) > >> return; > >> > >> ? > > > > It is one micro optimization since test_and_set_bit is much heavier than > > test_bit, so test_and_set_bit() is just needed in the 1st time. > > But having the 2 calls test_bit + test_and_set_bit creates a race, doesn't it ? > What if another cpu clears the bit between these 2 calls ? If test_bit() returns 0, there isn't such race since both sides are atomic OP. If test_bit() returns 1: 1) __blk_mq_tag_busy() vs. __blk_mq_tag_busy() - both does nothing 2) __blk_mq_tag_busy() vs. __blk_mq_tag_idle() - hctx_may_queue() is always following __blk_mq_tag_busy() - hctx_may_queue() returns true in case that this flag is cleared - current __blk_mq_tag_busy() does nothing, the following allocation works fine given hctx_may_queue() returns true - new __blk_mq_tag_busy() will setup everything as fine Thanks, Ming