On Sat, Apr 25, 2020 at 05:34:37PM +0800, Ming Lei wrote: > On Sat, Apr 25, 2020 at 10:32:24AM +0200, Christoph Hellwig wrote: > > On Sat, Apr 25, 2020 at 11:17:23AM +0800, Ming Lei wrote: > > > I am not sure if it helps by adding two helper, given only two > > > parameters are needed, and the new parameter is just a constant. > > > > > > > the point of barrier(), smp_mb__before_atomic and > > > > smp_mb__after_atomic), as none seems to be addressed and I also didn't > > > > see a reply. > > > > > > I believe it has been documented: > > > > > > + /* > > > + * Add one memory barrier in case that direct issue IO process is > > > + * migrated to other CPU which may not belong to this hctx, so we can > > > + * order driver tag assignment and checking BLK_MQ_S_INACTIVE. > > > + * Otherwise, barrier() is enough given both setting BLK_MQ_S_INACTIVE > > > + * and driver tag assignment are run on the same CPU in case that > > > + * BLK_MQ_S_INACTIVE is set. > > > + */ > > > > > > OK, I can add more: > > > > > > In case of not direct issue, __blk_mq_delay_run_hw_queue() guarantees > > > that dispatch is done on CPUs of this hctx. > > > > > > In case of direct issue, the direct issue IO process may be migrated to > > > other CPU which doesn't belong to hctx->cpumask even though the chance > > > is quite small, but still possible. > > > > > > This patch sets hctx as inactive in the last CPU of hctx, so barrier() > > > is enough for not direct issue. Otherwise, one smp_mb() is added for > > > ordering tag assignment(include setting rq) and checking S_INACTIVE in > > > blk_mq_get_driver_tag(). > > > > How do you prevent a cpu migration between the call to raw_smp_processor_id > > and barrier? > > Fine, we may have to use get_cpu()/put_cpu() for direct issue to cover > the check. For non-direct issue, either preempt is disabled or the work is > run on specified CPU. > > > > > Also as far as I can tell Documentation/core-api/atomic_ops.rst ask > > you to use smp_mb__before_atomic and smp_mb__after_atomic for any > > ordering with non-updating bitops. Quote: > > > > --------------------------------- snip --------------------------------- > > If explicit memory barriers are required around {set,clear}_bit() (which do > > not return a value, and thus does not need to provide memory barrier > > semantics), two interfaces are provided:: > > > > void smp_mb__before_atomic(void); > > void smp_mb__after_atomic(void); > > --------------------------------- snip --------------------------------- > > > > I really want someone who knows the memory model to look over this scheme, > > as it looks dangerous. > > smp_mb() is enough, the version of _[before|after]_atomic might be > a little lightweight in some ARCH I guess. Given smp_mb() or its variant > is only needed in very unlikely case of slow path, it is fine to just s/of/or Thanks, Ming