Re: [RFC PATCH v3 1/2] membarrier: Provide register expedited private command

On Fri, Sep 22, 2017 at 11:22:06AM +0800, Boqun Feng wrote:
> Hi Mathieu,
> 
> On Tue, Sep 19, 2017 at 06:13:41PM -0400, Mathieu Desnoyers wrote:
> > Provide a new command allowing processes to register their intent to use
> > the private expedited command.
> > 
> > This allows PowerPC to skip the full memory barrier in switch_mm(), and
> > only issue the barrier when scheduling into a task belonging to a
> > process that has registered to use expedited private.
> > 
> > Processes are now required to register before using
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.
> > 
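
For reference, with the registration scheme userspace ends up with a
two-step dance, roughly like below (a minimal sketch, assuming the
command names from this series; the membarrier() wrapper is
hypothetical, just for illustration):

	#include <linux/membarrier.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	/* Hypothetical wrapper; libc provides no membarrier() wrapper. */
	static int membarrier(int cmd, int flags)
	{
		return syscall(__NR_membarrier, cmd, flags);
	}

	int main(void)
	{
		/* Register once per process, e.g. at startup: */
		if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0)) {
			perror("membarrier register");
			return 1;
		}

		/* Later, the expedited barrier; EPERM if unregistered: */
		if (membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0)) {
			perror("membarrier");
			return 1;
		}
		return 0;
	}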
> 
> Sorry I'm late to the party, but I couldn't stop wondering whether we
> could avoid the registration entirely, because registering makes
> sys_membarrier() more complex (both in the interface and the
> implementation). So how about we trade off a little by taking
> some (not all) of the rq->locks?
> 
> The idea is that in membarrier_private_expedited(), we go through each
> CPU's ->curr and
> 
> 1)	If it's a userspace task and its ->mm matches, we send an IPI.
> 
> 2)	If it's a kernel task, we skip it.
> 
> 	(Because there will be an smp_mb() implied by mmdrop() when it
> 	switches to a userspace task.)
> 
> 3)	If it's a userspace task and its ->mm doesn't match, we take
> 	the corresponding rq->lock and check rq->curr again; if its ->mm
> 	matches, we send an IPI, otherwise we do nothing.
> 
> 	(Because if we observe that rq->curr doesn't match with rq->lock
> 	held, then when a task with a matching ->mm schedules in, the
> 	rq->lock pairing along with the smp_mb__after_spinlock() will
> 	guarantee that it observes all memory ops before
> 	sys_membarrier().)
> 
> membarrier_private_expedited() would look like this if we choose this
> way:
> 
> void membarrier_private_expedited(void)
> {
> 	int cpu;
> 	bool fallback = false;
> 	cpumask_var_t tmpmask;
> 	struct rq_flags rf;
> 
> 	if (num_online_cpus() == 1)
> 		return;
> 
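> 	/*
> 	 * Order the caller's prior memory accesses against the
> 	 * rq->curr reads below.
> 	 */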
> 	smp_mb();
> 
> 	if (!zalloc_cpumask_var(&tmpmask, GFP_NOWAIT)) {
> 		/* Fallback for OOM. */
> 		fallback = true;
> 	}
> 
> 	cpus_read_lock();
> 	for_each_online_cpu(cpu) {
> 		struct task_struct *p;
> 
> 		if (cpu == raw_smp_processor_id())
> 			continue;
> 
> 		rcu_read_lock();
> 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> 
> 		if (!p) {
> 			rcu_read_unlock();
> 			continue;
> 		}
> 
> 		if (p->mm == current->mm) {
> 			if (!fallback)
> 				__cpumask_set_cpu(cpu, tmpmask);
> 			else
> 				smp_call_function_single(cpu, ipi_mb, NULL, 1);
> 		}
> 
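> 		/*
> 		 * Already handled above, or a kernel task (!p->mm):
> 		 * no need for the locked recheck below.
> 		 */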
> 		if (p->mm == current->mm || !p->mm) {
> 			rcu_read_unlock();
> 			continue;
> 		}
> 
> 		rcu_read_unlock();
> 		
> 		/*
> 		 * This should be arch-specific code, as we only need
> 		 * it on archs without an smp_mb() in switch_mm()
> 		 * (i.e. powerpc).
> 		 */
> 		rq_lock_irq(cpu_rq(cpu), &rf);
> 		if (p->mm == current->mm) {

Oops, this one should be

		if (cpu_curr(cpu)->mm == current->mm)

> 			if (!fallback)
> 				__cpumask_set_cpu(cpu, tmpmask);
> 			else
> 				smp_call_function_single(cpu, ipi_mb, NULL, 1);

, and this had better be moved out of the rq->lock critical section.
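
Something like this for the tail of the loop, i.e. recheck ->curr under
rq->lock but only send the IPI after dropping it (still just a sketch):

	bool match = false;

	/*
	 * Recheck rq->curr under rq->lock; the lock acquisition pairs
	 * with smp_mb__after_spinlock() on the schedule-in side.
	 */
	rq_lock_irq(cpu_rq(cpu), &rf);
	if (cpu_curr(cpu)->mm == current->mm)
		match = true;
	rq_unlock_irq(cpu_rq(cpu), &rf);

	if (match) {
		if (!fallback)
			__cpumask_set_cpu(cpu, tmpmask);
		else
			smp_call_function_single(cpu, ipi_mb, NULL, 1);
	}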

Regards,
Boqun

> 		}
> 		rq_unlock_irq(cpu_rq(cpu), &rf);
> 	}
> 	if (!fallback) {
> 		smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
> 		free_cpumask_var(tmpmask);
> 	}
> 	cpus_read_unlock();
> 
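> 	/*
> 	 * Order the rq->curr reads and IPIs above against the
> 	 * caller's subsequent memory accesses.
> 	 */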
> 	smp_mb();
> }
> 
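
For completeness, ipi_mb above is assumed to be the trivial helper,
something like:

	static void ipi_mb(void *info)
	{
		smp_mb();	/* IPIs should be serializing, but don't rely on that */
	}
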
> Thoughts?
> 
> Regards,
> Boqun
> 
[...]
