On Fri, May 08, 2020 at 08:24:44PM -0700, Bart Van Assche wrote:
> On 2020-05-08 19:20, Ming Lei wrote:
> > Not sure why you mention queue freezing.
>
> This patch series introduces a fundamental race between modifying the
> hardware queue state (BLK_MQ_S_INACTIVE) and tag allocation. The only

Basically there are two cases:

1) setting BLK_MQ_S_INACTIVE and driver tag allocation run on the same
CPU, so we just need a compiler barrier; this is what happens most of
the time

2) setting BLK_MQ_S_INACTIVE and driver tag allocation run on different
CPUs, so one pair of smp_mb() is applied to avoid reordering; this only
happens when a direct-issue process is migrated.

Please take a look at the comment in this patch:

+	/*
+	 * In case that direct issue IO process is migrated to other CPU
+	 * which may not belong to this hctx, add one memory barrier so we
+	 * can order driver tag assignment and checking BLK_MQ_S_INACTIVE.
+	 * Otherwise, barrier() is enough given both setting BLK_MQ_S_INACTIVE
+	 * and driver tag assignment are run on the same CPU because
+	 * BLK_MQ_S_INACTIVE is only set after the last CPU of this hctx is
+	 * becoming offline.
+	 *
+	 * Process migration might happen after the check on current processor
+	 * id, smp_mb() is implied by processor migration, so no need to worry
+	 * about it.
+	 */

And you may find more discussion about this topic in the following thread:

https://lore.kernel.org/linux-block/20200429134327.GC700644@T590/

> mechanism I know of for enforcing the order in which another thread
> observes writes to different memory locations without inserting a memory
> barrier in the hot path is RCU (see also The RCU-barrier menagerie;
> https://lwn.net/Articles/573497/). The only existing such mechanism in
> the blk-mq core I know of is queue freezing. Hence my comment about
> queue freezing.

You didn't explain how queue freezing is used for this issue.

We are talking about CPU hotplug vs. IO.
In short, when one hctx becomes inactive (all CPUs in hctx->cpumask
have gone offline), in-flight IO from this hctx needs to be drained to
avoid IO timeouts. Also, all requests in the scheduler/sw queue of this
hctx need to be handled correctly to avoid an IO hang.

Queue freezing can only be applied at the request queue level, not at
the hctx level. And when requests can't be completed, waiting for the
freeze just hangs forever.

Thanks,
Ming
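
For reference, case 2) above (flag set and tag assignment on different
CPUs) can be modelled in userspace as below. This is only a sketch and
not the code from the patch: struct model_hctx, model_smp_mb(),
model_get_driver_tag() and model_set_inactive() are hypothetical names,
and smp_mb() is approximated with a C11 seq_cst fence. The point is the
pairing: each side does its store, then a full barrier, then its load,
so the two sides cannot both miss each other's store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a hardware queue context. */
struct model_hctx {
	atomic_bool inactive;    /* models the BLK_MQ_S_INACTIVE bit */
	atomic_int tags_in_use;  /* models assigned driver tags */
};

/* Assumption: smp_mb() is modelled as a C11 sequentially consistent fence. */
static inline void model_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Issue side: assign a tag first, then check the inactive flag.
 * Returns true if the request may proceed, false if the hctx was seen
 * inactive, in which case the tag is released again.
 */
static bool model_get_driver_tag(struct model_hctx *h)
{
	int prev = atomic_fetch_add_explicit(&h->tags_in_use, 1,
					     memory_order_relaxed);
	assert(prev >= 0);

	model_smp_mb();	/* pairs with the fence in model_set_inactive() */

	if (atomic_load_explicit(&h->inactive, memory_order_relaxed)) {
		atomic_fetch_sub_explicit(&h->tags_in_use, 1,
					  memory_order_relaxed);
		return false;
	}
	return true;
}

/*
 * Hotplug side: set the inactive flag first, then look for in-flight
 * tags. Returns the number of tags that still have to be drained.
 */
static int model_set_inactive(struct model_hctx *h)
{
	atomic_store_explicit(&h->inactive, true, memory_order_relaxed);

	model_smp_mb();	/* pairs with the fence in model_get_driver_tag() */

	return atomic_load_explicit(&h->tags_in_use, memory_order_relaxed);
}
```

With both fences in place, either the issue side sees the flag and
backs off, or the hotplug side sees the tag and waits for it. Dropping
either fence allows the outcome where the tag is assigned and nobody
drains it.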