Re: [PATCH 1/3] rcu: Use static initializer for krc.lock

Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> · Thu, 16 Apr 2020 17:18:24 +0200

On 2020-04-16 10:42:54 [-0400], Joel Fernandes wrote:
> Hi Sebastian,
Hi Joel,

> > @@ -3139,10 +3136,8 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> >  	unsigned long flags;
> >  	struct kfree_rcu_cpu *krcp;
> >  
> > -	local_irq_save(flags);	// For safely calling this_cpu_ptr().
> > -	krcp = this_cpu_ptr(&krc);
> > -	if (krcp->initialized)
> > -		spin_lock(&krcp->lock);
> > +	krcp = raw_cpu_ptr(&krc);
> > +	spin_lock_irqsave(&krcp->lock, flags);
> 
> I agree with the patch, except for this bit. Would it be ok to split the
> other parts of this patch into separate patch(es)?

if you want, I could split it. Part of the goal is to get rid of the
local_irq_save(). I don't mind if it happens a patch later :)

> For this part of the patch, I am wondering what is the value in it. For one,

local_irq_save() + spin_lock() is the problem, see
    https://www.kernel.org/doc/html/latest/locking/locktypes.html#spinlock-t-and-rwlock-t

(the link-instead-explaining part is nice)

> we don't want migration between sampling the value of the pointer, and
> actually using it. One reason at least is because we have certain resources
> in the per-cpu structure that take advantage of cache locality (such as the
> kfree_rcu_cpu::krw_arr array). In the common case, the callback would be
> executed on the same CPU it is queued on. All of the access is local to the
> CPU.

In the common case, yes.

> I know the work item can still be handled on other CPUs so we are not
> strictly doing it in the worst case (however per my understanding, the work
> item is executed in the same CPU most of the time).

The worker is migrated if the CPU goes offline for instance.

> Also on a slightly related note, could you share with me why this_cpu_ptr()
> is unsafe to call in non-preemptible context, but raw_cpu_ptr() is ok? Just
> curious on that.

I wouldn't use safe/unsafe as a term to describe the situation. Both do
the same however this_cpu_ptr() raises a warning (given it is enabled)
if the current context can migrate to another CPU
(check_preemption_disabled()).
If you know what you do and it does not matter whether migration happens
or not, you can use raw_cpu_ptr() to avoid the warning. In this case it
is okay because CPU migration does not matter here since the data
structure is protected with a lock. Using the data structure from
another CPU (due to accidental migration) is not ideal but _no_
assumption happens later on for dealing with the current or further
data. The timer/worker will be scheduled on the "wrong" CPU but this
again has no bad consequences. The timer and worker use container_of()
so they operate on the correct resources. Would they instead use
per_cpu_ptr(krc, smp_processor_id()) then it would operate on the wrong
data.

Regarding the safe/unsafe: Would this_cpu_ptr() trigger a warning and
you add preempt_disable() around the block, where you use the per-CPU
data, then you would silence the warning and everything would look fine.
If this per-CPU data structure would also be accessed from interrupt
context then you would get inconsistent data and no warning.
What I'm trying to say is that even this_cpu_ptr() is not "safe" if used
without a thought.

> thanks,
> 
>  - Joel

Sebastian