Re: linux-next: manual merge of the rcu tree with the tip tree

On Mon, Jul 31, 2017 at 9:03 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Aug 01, 2017 at 12:04:05AM +0000, Mathieu Desnoyers wrote:
>> ----- On Jul 31, 2017, at 12:13 PM, Paul E. McKenney paulmck@xxxxxxxxxxxxxxxxxx wrote:
>>

>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit fde19879b6bd1abc0c1d4d5f945efed61bf7eb8c
> Author: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Date:   Fri Jul 28 16:40:40 2017 -0400
>
>     membarrier: Expedited private command
>
>     Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED with IPIs using cpumask built
>     from all runqueues for which current thread's mm is the same as the
>     thread calling sys_membarrier. It executes faster than the non-expedited
>     variant (no blocking). It also works on NOHZ_FULL configurations.
>
>     Scheduler-wise, it requires a memory barrier before and after context
>     switching between processes (which have different mm). The memory
>     barrier before context switch is already present. For the barrier after
>     context switch:
>
>     * Our TSO archs can do RELEASE without being a full barrier. Look at
>       x86 spin_unlock() being a regular STORE for example.  But for those
>       archs, all atomics imply smp_mb and all of them have atomic ops in
>       switch_mm() for mm_cpumask().

I think that, on x86, context switches, even without mm changes, must
at least flush the store buffer (maybe SFENCE is okay) to avoid
visible inconsistency due to store-buffer forwarding.

Anyway, can you document whatever property you require with a comment
in switch_mm(), or wherever you're finding that property, so that
future arch changes don't break it?
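
Something like the sketch below is what I have in mind.  This is
purely illustrative: I'm guessing at the property (a full barrier
between the rq->curr update and the next user-space access of the
incoming task) and at the placement, roughly where x86 does the
atomic update of mm_cpumask(next) in switch_mm_irqs_off():

	/*
	 * sys_membarrier() relies on a full memory barrier after the
	 * rq->curr update and before the switched-to task touches
	 * user-space memory again.  On x86 that barrier is currently
	 * supplied by the atomic RMW below; if this atomic op is ever
	 * removed or weakened, an explicit smp_mb() must be added for
	 * membarrier's sake.
	 */
	cpumask_set_cpu(cpu, mm_cpumask(next));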

> +static void membarrier_private_expedited(void)
> +{
> +       int cpu;
> +       bool fallback = false;
> +       cpumask_var_t tmpmask;
> +
> +       if (num_online_cpus() == 1)
> +               return;
> +
> +       /*
> +        * Matches memory barriers around rq->curr modification in
> +        * scheduler.
> +        */
> +       smp_mb();       /* system call entry is not a mb. */
> +
> +       /*
> +        * Expedited membarrier commands guarantee that they won't
> +        * block, hence the GFP_NOWAIT allocation flag and fallback
> +        * implementation.
> +        */
> +       if (!zalloc_cpumask_var(&tmpmask, GFP_NOWAIT)) {
> +               /* Fallback for OOM. */
> +               fallback = true;
> +       }
> +
> +       cpus_read_lock();
> +       for_each_online_cpu(cpu) {
> +               struct task_struct *p;
> +
> +               /*
> +                * Skipping the current CPU is OK even though we can be
> +                * migrated at any point. The current CPU, at the point
> +                * where we read raw_smp_processor_id(), is ensured to
> +                * be in program order with respect to the caller
> +                * thread. Therefore, we can skip this CPU from the
> +                * iteration.
> +                */
> +               if (cpu == raw_smp_processor_id())
> +                       continue;
> +               rcu_read_lock();
> +               p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> +               if (p && p->mm == current->mm) {

I'm a bit surprised you're iterating all CPUs instead of just CPUs in
mm_cpumask().
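
I.e., something along these lines (untested sketch; this assumes
mm_cpumask() is guaranteed to be a superset of the CPUs currently
running this mm on every architecture, which may well be the sticking
point, and it omits the OOM fallback path from your patch):

	for_each_cpu(cpu, mm_cpumask(current->mm)) {
		struct task_struct *p;

		if (cpu == raw_smp_processor_id())
			continue;
		rcu_read_lock();
		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
		if (p && p->mm == current->mm)
			cpumask_set_cpu(cpu, tmpmask);
		rcu_read_unlock();
	}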