On Fri, Sep 22, 2017 at 11:22:06AM +0800, Boqun Feng wrote:
> Hi Mathieu,
> 
> On Tue, Sep 19, 2017 at 06:13:41PM -0400, Mathieu Desnoyers wrote:
> > Provide a new command allowing processes to register their intent to use
> > the private expedited command.
> > 
> > This allows PowerPC to skip the full memory barrier in switch_mm(), and
> > only issue the barrier when scheduling into a task belonging to a
> > process that has registered to use expedited private.
> > 
> > Processes are now required to register before using
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.
> > 
> 
> Sorry I'm late to the party, but I couldn't stop thinking about whether
> we could avoid the registration altogether, because registering makes
> sys_membarrier() more complex (both the interface and the
> implementation). So how about we trade off a little by taking some (not
> all) of the rq->locks?
> 
> The idea is that in membarrier_private_expedited() we go through ->curr
> on each CPU and:
> 
> 1) If it's a userspace task and its ->mm matches, we send an IPI.
> 
> 2) If it's a kernel task, we skip it.
> 
>    (Because an smp_mb() will be implied by mmdrop() when it switches
>    to a userspace task.)
> 
> 3) If it's a userspace task and its ->mm does not match, we take the
>    corresponding rq->lock and check rq->curr again; if its ->mm
>    matches, we send an IPI, otherwise we do nothing.
> 
>    (Because if we observe that rq->curr does not match while holding
>    rq->lock, then when a task with a matching ->mm schedules in, the
>    rq->lock pairing along with the smp_mb__after_spinlock() will
>    guarantee it observes all memory ops before sys_membarrier().)
> 
> membarrier_private_expedited() will look like this if we choose this
> way:
> 
> void membarrier_private_expedited()
> {
> 	int cpu;
> 	bool fallback = false;
> 	cpumask_var_t tmpmask;
> 	struct rq_flags rf;
> 
> 	if (num_online_cpus() == 1)
> 		return;
> 
> 	smp_mb();
> 
> 	if (!zalloc_cpumask_var(&tmpmask, GFP_NOWAIT)) {
> 		/* Fallback for OOM. */
> 		fallback = true;
> 	}
> 
> 	cpus_read_lock();
> 	for_each_online_cpu(cpu) {
> 		struct task_struct *p;
> 
> 		if (cpu == raw_smp_processor_id())
> 			continue;
> 
> 		rcu_read_lock();
> 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> 
> 		if (!p) {
> 			rcu_read_unlock();
> 			continue;
> 		}
> 
> 		if (p->mm == current->mm) {
> 			if (!fallback)
> 				__cpumask_set_cpu(cpu, tmpmask);
> 			else
> 				smp_call_function_single(cpu, ipi_mb, NULL, 1);
> 		}
> 
> 		if (p->mm == current->mm || !p->mm) {
> 			rcu_read_unlock();
> 			continue;
> 		}
> 
> 		rcu_read_unlock();
> 
> 		/*
> 		 * This should be arch-specific code, as we don't need
> 		 * it anywhere other than on archs without an smp_mb()
> 		 * in switch_mm() (i.e. powerpc).
> 		 */
> 		rq_lock_irq(cpu_rq(cpu), &rf);
> 		if (p->mm == current->mm) {

Oops, this one should be

	if (cpu_curr(cpu)->mm == current->mm)

> 			if (!fallback)
> 				__cpumask_set_cpu(cpu, tmpmask);
> 			else
> 				smp_call_function_single(cpu, ipi_mb, NULL, 1);

, and this had better be moved out of the rq->lock critical section.

Regards,
Boqun

> 		}
> 		rq_unlock_irq(cpu_rq(cpu), &rf);
> 	}
> 	if (!fallback) {
> 		smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
> 		free_cpumask_var(tmpmask);
> 	}
> 	cpus_read_unlock();
> 
> 	smp_mb();
> }
> 
> Thoughts?
> 
> Regards,
> Boqun
> [...]
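
For readers following along: below is a minimal sketch (mine, not a patch from this thread) of how the tail of that per-CPU loop could look with both of Boqun's corrections folded in, i.e. re-checking cpu_curr(cpu)->mm under rq->lock rather than the possibly stale RCU-derived p->mm, and issuing the IPI only after rq->lock has been dropped:

	/*
	 * Hypothetical sketch only: re-check the *current* task's ->mm
	 * while holding rq->lock, but defer the cpumask update / IPI
	 * until the lock (and its irq-disabled region) is released.
	 */
	bool matched = false;

	rq_lock_irq(cpu_rq(cpu), &rf);
	if (cpu_curr(cpu)->mm == current->mm)
		matched = true;
	rq_unlock_irq(cpu_rq(cpu), &rf);

	if (matched) {
		if (!fallback)
			__cpumask_set_cpu(cpu, tmpmask);
		else
			smp_call_function_single(cpu, ipi_mb, NULL, 1);
	}

Deferring the IPI matters because smp_call_function_single() with wait == 1 must not be called with interrupts disabled (it can deadlock waiting for the target CPU), and rq_lock_irq() keeps interrupts off for the whole critical section.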