On Fri, Mar 04, 2022 at 04:35:54PM -0800, Andrew Morton wrote:
> On Fri, 4 Mar 2022 13:29:31 -0300 Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>
> >
> > On systems that run FIFO:1 applications that busy loop
> > on isolated CPUs, executing tasks on such CPUs under
> > lower priority is undesired (since that will either
> > hang the system, or cause longer interruption to the
> > FIFO task due to execution of lower priority task
> > with very small sched slices).
> >
> > Commit d479960e44f27e0e52ba31b21740b703c538027c ("mm: disable LRU
> > pagevec during the migration temporarily") relies on
> > queueing work items on all online CPUs to ensure visibility
> > of lru_disable_count.
> >
> > However, its possible to use synchronize_rcu which will provide the same
> > guarantees (see comment this patch modifies on lru_cache_disable).
> >
> > Fixes:
> >
> > ...
> >
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -831,8 +831,7 @@ inline void __lru_add_drain_all(bool force_all_cpus)
> >  	for_each_online_cpu(cpu) {
> >  		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
> >
> > -		if (force_all_cpus ||
> > -		    pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
> > +		if (pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
>
> Please changelog this alteration?

It should be now. Are you OK with this changelog? (If not, please let me
know what should be improved.)

On systems that run FIFO:1 applications that busy loop,
any SCHED_OTHER task that attempts to execute
on such a CPU (such as work threads) will not
be scheduled, which leads to system hangs.

Commit d479960e44f27e0e52ba31b21740b703c538027c ("mm: disable LRU
pagevec during the migration temporarily") relies on
queueing work items on all online CPUs to ensure visibility
of lru_disable_count.

To fix this, replace the usage of work items with synchronize_rcu,
which provides the same guarantees:

Readers of lru_disable_count are protected by either disabling
preemption or rcu_read_lock:

preempt_disable, local_irq_disable  [bh_lru_lock()]
rcu_read_lock                       [rt_spin_lock CONFIG_PREEMPT_RT]
preempt_disable                     [local_lock !CONFIG_PREEMPT_RT]

Since v5.1 kernel, synchronize_rcu() is guaranteed to wait on
preempt_disable() regions of code. So any CPU which sees
lru_disable_count = 0 will have exited the critical
section when synchronize_rcu() returns.

Fixes: ...

Thanks.
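
As an aside, the ordering argument in the changelog can be modelled in
userspace. The sketch below is only an illustration under stated
assumptions: it uses C11 atomics and pthreads, a per-thread counter stands
in for preempt_disable()/rcu_read_lock(), and model_synchronize() is a crude
substitute for synchronize_rcu(), not how RCU is implemented. All names
(model_cache_add, model_cache_disable, run_model, and so on) are
hypothetical and do not exist in mm/swap.c.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NREADERS 4
#define NPAGES   100000

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static atomic_int  disable_count;  /* models lru_disable_count */
static atomic_long direct_adds;    /* pages that bypassed the cache */

struct reader {
	atomic_ulong ctr;    /* odd while inside the read-side section */
	atomic_long  cached; /* models the per-CPU pagevec fill level */
};
static struct reader readers[NREADERS];

/* Read side: models adding a page with preemption disabled. */
static void model_cache_add(struct reader *r)
{
	atomic_fetch_add(&r->ctr, 1);               /* enter section */
	if (atomic_load(&disable_count) == 0)
		atomic_fetch_add(&r->cached, 1);    /* batch locally */
	else
		atomic_fetch_add(&direct_adds, 1);  /* cache disabled */
	atomic_fetch_add(&r->ctr, 1);               /* leave section */
}

/* Crude model of synchronize_rcu(): wait out sections already running. */
static void model_synchronize(void)
{
	for (int i = 0; i < NREADERS; i++) {
		unsigned long snap = atomic_load(&readers[i].ctr);

		if (snap & 1)
			while (atomic_load(&readers[i].ctr) == snap)
				sched_yield(); /* reader still in section */
	}
}

/* Write side: bump the counter, wait for readers, then drain. */
static long model_cache_disable(void)
{
	long drained = 0;

	atomic_fetch_add(&disable_count, 1);
	model_synchronize();
	/* Any reader that saw 0 has left its section: draining is safe. */
	for (int i = 0; i < NREADERS; i++)
		drained += atomic_exchange(&readers[i].cached, 0);
	return drained;
}

static void *reader_thread(void *arg)
{
	for (int i = 0; i < NPAGES; i++)
		model_cache_add(arg);
	return NULL;
}

/* Returns 0 if nothing was cached after the disable completed. */
int run_model(void)
{
	pthread_t tid[NREADERS];
	long drained, leftover = 0;

	atomic_store(&disable_count, 0);
	atomic_store(&direct_adds, 0);
	for (int i = 0; i < NREADERS; i++) {
		atomic_store(&readers[i].cached, 0);
		pthread_create(&tid[i], NULL, reader_thread, &readers[i]);
	}
	drained = model_cache_disable();
	for (int i = 0; i < NREADERS; i++)
		pthread_join(tid[i], NULL);
	for (int i = 0; i < NREADERS; i++)
		leftover += atomic_load(&readers[i].cached);
	if (leftover != 0)
		return 1; /* a page was cached after the disable */
	if (drained + atomic_load(&direct_adds) != (long)NREADERS * NPAGES)
		return 2; /* a page went missing */
	return 0;
}
```

Calling run_model() from a small main() (built with something like
cc -pthread) should return 0: a reader that observed the counter as 0 has
left its section before the drain runs, and every later reader sees a
non-zero counter and bypasses the cache.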