On Sat, Apr 18, 2020 at 11:09:21AM +0800, Ming Lei wrote: > -static bool blk_mq_get_driver_tag(struct request *rq) > +static bool blk_mq_get_driver_tag(struct request *rq, bool direct_issue) > { > struct blk_mq_alloc_data data = { > .q = rq->q, > @@ -1054,6 +1054,23 @@ static bool blk_mq_get_driver_tag(struct request *rq) > data.hctx->tags->rqs[rq->tag] = rq; > } > allocated: > + /* > + * Add one memory barrier in case that direct issue IO process > + * is migrated to other CPU which may not belong to this hctx, > + * so we can order driver tag assignment and checking > + * BLK_MQ_S_INACTIVE. Otherwise, barrier() is enough given the > + * two code paths are run on single CPU in case that > + * BLK_MQ_S_INACTIVE is set. Please use up all 80 characters for the comments (also elsewhere). That being said I fail to see what the barrier() as a pure compiler barrier even buys us here. > + */ > + if (unlikely(direct_issue && rq->mq_ctx->cpu != raw_smp_processor_id())) > + smp_mb(); > + else > + barrier(); > + > + if (unlikely(test_bit(BLK_MQ_S_INACTIVE, &data.hctx->state))) { > + blk_mq_put_driver_tag(rq); > + return false; > + } > return rq->tag != -1; Also if you take my cleanup to patch 2, we could just open code the direct_issue case in the only caller instead of having the magic in the common routine. > + if ((cpumask_next_and(-1, hctx->cpumask, cpu_online_mask) != cpu) || > + (cpumask_next_and(cpu, hctx->cpumask, cpu_online_mask) > + < nr_cpu_ids)) No need for the inner braces. Also in this case I think something like: if (cpumask_next_and(-1, hctx->cpumask, cpu_online_mask) != cpu || cpumask_next_and(cpu, hctx->cpumask, cpu_online_mask) < nr_cpu_ids) might be a tad more readable, but then again this might even be worth a little inline helper once we start bike shedding. > + /* > + * The current CPU is the last one in this hctx, S_INACTIVE > + * can be observed in dispatch path without any barrier needed, > + * cause both are run on one same CPU. > + */ > + set_bit(BLK_MQ_S_INACTIVE, &hctx->state); > + /* > + * Order setting BLK_MQ_S_INACTIVE and checking rq->tag & rqs[tag], > + * and its pair is the smp_mb() in blk_mq_get_driver_tag > + */ > + smp_mb(); > + blk_mq_hctx_drain_inflight_rqs(hctx); > + return 0; FYI, Documentation/core-api/atomic_ops.rst asks for using smp_mb__before_atomic / smp_mb__after_atomic around the bitops. > +static int blk_mq_hctx_notify_online(unsigned int cpu, struct hlist_node *node) > +{ > + struct blk_mq_hw_ctx *hctx = hlist_entry_safe(node, > + struct blk_mq_hw_ctx, cpuhp_online); > + > + if (!cpumask_test_cpu(cpu, hctx->cpumask)) > + return 0; > + > + clear_bit(BLK_MQ_S_INACTIVE, &hctx->state); > return 0; > } Why not simply: if (cpumask_test_cpu(cpu, hctx->cpumask)) clear_bit(BLK_MQ_S_INACTIVE, &hctx->state); return 0; Conceptually the changes look fine.