Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running

Oleg Nesterov <oleg@xxxxxxxxxx> · Tue, 10 Mar 2015 18:28:16 +0100

On 03/10, Linus Torvalds wrote:
>
> On Sat, Mar 7, 2015 at 9:13 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >> +             /*
> >> +              * Ensure we emit the owner->on_cpu, dereference _after_
> >> +              * checking sem->owner still matches owner, if that fails,
> >> +              * owner might point to free()d memory, if it still matches,
> >> +              * the rcu_read_lock() ensures the memory stays valid.
> >                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Yes, this is another case when we wrongly assume this.
> >
> > Peter, should I resend
> >
> >         [PATCH 3/3] introduce task_rcu_dereference()
> >         http://marc.info/?l=linux-kernel&m=141443631413914
> >
> > ? or should we add another call_rcu() in finish_task_switch() (like -rt does)
> > to make this true?
>
> I think we should just make 'task_struct_cachep' have SLAB_DESTROY_BY_RCU.

This is what I initially suggested too, but then I tried to argue against it.
But it seems that I have lost if you prefer SLAB_DESTROY_BY_RCU too.

Yes, SLAB_DESTROY_BY_RCU will work in this case because we recheck
->owner in a loop. And because task->on_cpu is just a word we can
safely read.
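
To spell out the pattern, here is a userspace sketch only, not the rwsem
code. struct task, struct sem, task_pool and spin_on_owner() are made-up
stand-ins, and the static pool merely models what type-stable slab memory
would give a reader under rcu_read_lock():

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and rw_semaphore. */
struct task {
	atomic_int on_cpu;	/* nonzero while the task runs on a CPU */
};

struct sem {
	_Atomic(struct task *) owner;
};

/*
 * Type-stable pool: a slot may be reused for another task, but it is
 * always a struct task, so loading ->on_cpu through a stale pointer
 * never touches memory of some other type.
 */
static struct task task_pool[4];

/*
 * Returns true if the caller should retry taking the lock (the owner
 * changed or released it), false if it should stop spinning (the owner
 * is not running, or the spin budget of this sketch ran out).
 */
static bool spin_on_owner(struct sem *sem, struct task *owner, int max_spins)
{
	while (max_spins-- > 0) {
		/* Recheck first: a stale owner tells us nothing. */
		if (atomic_load(&sem->owner) != owner)
			return true;
		/* Just a word; at worst it belongs to a recycled task. */
		if (!atomic_load(&owner->on_cpu))
			return false;
	}
	return false;
}
```

A wrong ->on_cpu value read from a recycled slot costs at most one extra
iteration, because the next ->owner recheck catches the change.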

But this won't fix other problems we might have. For example, suppose
that we need get_task_struct(owner) in this code; that won't work.
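
One possible illustration, assuming the concern is slot reuse: with
type-stable memory the stale pointer stays readable, but a reference taken
through it may pin whatever task now lives in that slot. Everything below
(struct task, slot, task_alloc(), task_free(), get_task()) is a hypothetical
single-threaded model, not kernel API:

```c
/* Hypothetical model of one type-stable slab slot. */
struct task {
	int pid;
	int usage;	/* reference count */
};

static struct task slot;

static struct task *task_alloc(int pid)
{
	slot.pid = pid;
	slot.usage = 1;
	return &slot;
}

static void task_free(struct task *t)
{
	t->usage = 0;	/* memory stays a struct task, ready for reuse */
}

static void get_task(struct task *t)
{
	t->usage++;	/* models an unconditional get_task_struct() */
}

/* Returns the pid of the task that actually ends up pinned. */
static int pinned_pid_after_reuse(void)
{
	struct task *owner = task_alloc(100);	/* spinner samples ->owner */

	task_free(owner);			/* old owner exits */
	task_alloc(200);			/* slot is recycled */
	get_task(owner);			/* meant for pid 100 */
	return owner->pid;			/* but this is pid 200 now */
}
```

The read through the stale pointer is safe, yet the reference lands on the
wrong task.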

Or, as Kirill pointed out, let's look at "tsk = ACCESS_ONCE(cpu_rq(cpu)->curr)"
in task_numa_group(). Even if this is "fixed" by SLAB_DESTROY_BY_RCU,
this code won't be correct anyway, even though (I think) it will be safe
to dereference ->numa_group as well.

But OK, I won't argue.

Oleg.
