On Fri, May 08, 2020 at 04:39:46PM -0700, Bart Van Assche wrote:
> On 2020-05-04 19:09, Ming Lei wrote:
> > -static bool blk_mq_get_driver_tag(struct request *rq)
> > +static bool blk_mq_get_driver_tag(struct request *rq, bool direct_issue)
> >  {
> >  	if (rq->tag != -1)
> >  		return true;
> > -	return __blk_mq_get_driver_tag(rq);
> > +
> > +	if (!__blk_mq_get_driver_tag(rq))
> > +		return false;
> > +	/*
> > +	 * In case that direct issue IO process is migrated to other CPU
> > +	 * which may not belong to this hctx, add one memory barrier so we
> > +	 * can order driver tag assignment and checking BLK_MQ_S_INACTIVE.
> > +	 * Otherwise, barrier() is enough given both setting BLK_MQ_S_INACTIVE
> > +	 * and driver tag assignment are run on the same CPU because
> > +	 * BLK_MQ_S_INACTIVE is only set after the last CPU of this hctx is
> > +	 * becoming offline.
> > +	 *
> > +	 * Process migration might happen after the check on current processor
> > +	 * id, smp_mb() is implied by processor migration, so no need to worry
> > +	 * about it.
> > +	 */
> > +	if (unlikely(direct_issue && rq->mq_ctx->cpu != raw_smp_processor_id()))
> > +		smp_mb();
> > +	else
> > +		barrier();
> > +
> > +	if (unlikely(test_bit(BLK_MQ_S_INACTIVE, &rq->mq_hctx->state))) {
> > +		blk_mq_put_driver_tag(rq);
> > +		return false;
> > +	}
> > +	return true;
> >  }
>
> How much does this patch slow down the hot path?

Basically zero cost is added to the hot path. Specifically:

> +	if (unlikely(direct_issue && rq->mq_ctx->cpu != raw_smp_processor_id()))

In the direct issue case, the chance that the IO process has migrated is
very small, since direct issue follows request allocation and the window
between the two is short. So smp_mb() won't be run most of the time.

> +		smp_mb();
> +	else
> +		barrier();

So barrier() is what gets added most of the time, and its effect can be
ignored since it is just a compiler barrier.

> +
> +	if (unlikely(test_bit(BLK_MQ_S_INACTIVE, &rq->mq_hctx->state))) {

hctx->state is always checked in the hot path, so this is basically zero cost.

> +		blk_mq_put_driver_tag(rq);
> +		return false;
> +	}
>
> Can CPU migration be fixed without affecting the hot path, e.g. by using
> the request queue freezing mechanism?

Why do we want to fix CPU migration of the direct issue IO process? It may
not be necessary, and it would be quite difficult:

1) preempt disable was removed previously in a cleanup patch, since the
request is allocated

2) we have drivers which may set BLOCKING, so .queue_rq() may sleep

Not sure why you mention queue freezing.

Thanks,
Ming
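
For readers following the barrier discussion, here is a minimal userspace sketch of the "assign tag, then check the inactive flag" pattern from the quoted hunk. It is only an illustration under stated assumptions: C11 atomic_thread_fence() stands in roughly for smp_mb(), atomic_signal_fence() stands in roughly for the compiler-only barrier(), and every sketch_* name plus the this_cpu parameter is hypothetical, not kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; not the real blk-mq types. */
struct sketch_hctx {
	atomic_bool inactive;		/* plays the role of BLK_MQ_S_INACTIVE */
};

struct sketch_rq {
	int tag;			/* -1 means "no driver tag assigned" */
	int alloc_cpu;			/* plays the role of rq->mq_ctx->cpu */
	struct sketch_hctx *hctx;
};

/* Assumption: allocation always succeeds here; the real allocator can fail. */
static bool sketch_alloc_tag(struct sketch_rq *rq)
{
	rq->tag = 0;
	return true;
}

static bool sketch_get_driver_tag(struct sketch_rq *rq, bool direct_issue,
				  int this_cpu)
{
	assert(rq && rq->hctx);

	if (rq->tag != -1)
		return true;
	if (!sketch_alloc_tag(rq))
		return false;

	/*
	 * Rare case: the issuing task runs on a CPU other than the one the
	 * request was allocated on, so pay for a full fence. Common case:
	 * only stop the compiler from reordering the two steps.
	 */
	if (direct_issue && rq->alloc_cpu != this_cpu)
		atomic_thread_fence(memory_order_seq_cst);
	else
		atomic_signal_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&rq->hctx->inactive, memory_order_relaxed)) {
		rq->tag = -1;		/* give the tag back */
		return false;
	}
	return true;
}
```

The point of the split is that the expensive fence sits behind a branch that is almost never taken, while the common path only constrains the compiler, which matches the "basically zero cost" argument above.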