RE: [PATCH v2] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug

> On Nov 28, 2022, at 11:54 PM, Zhang, Qiang1 <qiang1.zhang@xxxxxxxxx> wrote:
> 
> On Mon, Nov 28, 2022 at 10:34:28PM +0800, Zqiang wrote:
>> Currently, rcu_tasks_rude_wait_gp() is invoked to wait for one rude
>> RCU-tasks grace period. If __num_online_cpus == 1, it returns
>> directly, indicating the end of the rude RCU-tasks grace period.
>> Suppose the system has two CPUs, and consider the following scenario:
>> 
>>    CPU0                                   CPU1 (going offline)
>>                          migration/1 task:
>>                                      cpu_stopper_thread
>>                                       -> take_cpu_down
>>                                          -> _cpu_disable
>>                               (dec __num_online_cpus)
>>                                          ->cpuhp_invoke_callback
>>                                                preempt_disable
>>                        access old_data0
>>           task1
>> del old_data0                                  .....
>> synchronize_rcu_tasks_rude()
>> task1 schedule out
>> ....
>> task2 schedule in
>> rcu_tasks_rude_wait_gp()
>>     ->__num_online_cpus == 1
>>       ->return
>> ....
>> task1 schedule in
>> ->free old_data0
>>                                                preempt_enable
>> 
>> When CPU1 decrements __num_online_cpus and __num_online_cpus becomes one,
>> CPU1 has not yet finished going offline: the stop_machine task (migration/1)
>> is still running on CPU1 and may still be accessing 'old_data0', but
>> 'old_data0' has already been freed on CPU0.
>> 
>> This commit adds cpus_read_lock()/cpus_read_unlock() protection around
>> the access to __num_online_cpus, to ensure that any CPU in the process
>> of going offline has finished doing so.
>> 
>> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
>> 
>> First, good eyes and good catch!!!
>> 
>> The purpose of that check for num_online_cpus() is not performance
>> on single-CPU systems, but rather correct operation during early boot.
>> So a simpler way to make that work is to check for RCU_SCHEDULER_RUNNING,
>> for example, as follows:
>> 
>>    if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING &&
>>        num_online_cpus() <= 1)
>>        return;    // Early boot fastpath for only one CPU.
> 
> Hi Paul
> 
> During system startup, RCU_SCHEDULER_RUNNING is set only after the other
> CPUs have been started, so the following can happen:
> 
>    CPU0                                   CPU1
> 
>    if (rcu_scheduler_active !=
>        RCU_SCHEDULER_RUNNING &&
>        __num_online_cpus == 1)
>        return;                            inc __num_online_cpus
>                                           (__num_online_cpus == 2)
> 
> CPU0 does not notice CPU1's update of __num_online_cpus in time.
> Can we move rcu_set_runtime_mode() before smp_init()?
> Any thoughts?
>
>Is anyone expected to do an rcu-tasks operation before the scheduler is running?

Not sure if such a scenario exists.

>Typically this requires the tasks to context switch which is a scheduler operation.
>
>If the scheduler is not yet running, then I don’t think missing an update to __num_online_cpus matters, since no one does a tasks-RCU synchronize.

Hi Joel

After the kernel_init task starts running, and before it calls smp_init()
to bring up the other CPUs, the scheduler has already been initialized,
so task context switches can occur.
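
To make the two checks under discussion easier to compare, here is a small
user-space model. It is only a sketch, not kernel code: the enum and the
function names are made up for illustration, and the online-CPU count is
passed in as a plain argument rather than read from __num_online_cpus.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only stand-ins for the rcu_scheduler_active states (hypothetical names). */
enum model_sched_state {
	MODEL_SCHED_INACTIVE,	/* scheduler not yet initialized */
	MODEL_SCHED_INIT,	/* scheduler initialized, early boot still in progress */
	MODEL_SCHED_RUNNING,	/* full runtime mode */
};

/* Check as it is today: skip the grace period whenever one CPU is online. */
static bool fastpath_original(unsigned int online_cpus)
{
	assert(online_cpus >= 1);	/* at least the boot CPU is online */
	return online_cpus <= 1;
}

/* Suggested check: skip only before runtime mode AND with one CPU online. */
static bool fastpath_suggested(enum model_sched_state state,
			       unsigned int online_cpus)
{
	assert(online_cpus >= 1);
	return state != MODEL_SCHED_RUNNING && online_cpus <= 1;
}
```

In this model, fastpath_suggested(MODEL_SCHED_RUNNING, 1) is false, so the
hotplug case from the patch no longer returns early. On the other hand,
fastpath_suggested(MODEL_SCHED_INIT, 1) is true, which is the early-boot
window in question if a second CPU comes online right after the count is read.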

Thanks
Zqiang

>
>Or did I miss something?
>
>Thanks.
>
>
>
> 
> Thanks
> Zqiang
> 
>> 
>> This works because rcu_scheduler_active is set to RCU_SCHEDULER_RUNNING
>> long before it is possible to offline CPUs.
>> 
>> Yes, schedule_on_each_cpu() does do cpus_read_lock(), again, good eyes,
>> and it also unnecessarily does the schedule_work_on() the current CPU,
>> but the code calling synchronize_rcu_tasks_rude() is on high-overhead
>> code paths, so this overhead is down in the noise.
>> 
>> Until further notice, anyway.
>> 
>> So simplicity is much more important than performance in this code.
>> So just adding the check for RCU_SCHEDULER_RUNNING should fix this,
>> unless I am missing something (always possible!).
>> 
>>                            Thanx, Paul
>> 
>> ---
>> kernel/rcu/tasks.h | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
>> index 4a991311be9b..08e72c6462d8 100644
>> --- a/kernel/rcu/tasks.h
>> +++ b/kernel/rcu/tasks.h
>> @@ -1033,14 +1033,30 @@ static void rcu_tasks_be_rude(struct work_struct *work)
>> {
>> }
>> 
>> +static DEFINE_PER_CPU(struct work_struct, rude_work);
>> +
>> // Wait for one rude RCU-tasks grace period.
>> static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
>> {
>> +    int cpu;
>> +    struct work_struct *work;
>> +
>> +    cpus_read_lock();
>>    if (num_online_cpus() <= 1)
>> -        return;    // Fastpath for only one CPU.
>> +        goto end;// Fastpath for only one CPU.
>> 
>>    rtp->n_ipis += cpumask_weight(cpu_online_mask);
>> -    schedule_on_each_cpu(rcu_tasks_be_rude);
>> +    for_each_online_cpu(cpu) {
>> +        work = per_cpu_ptr(&rude_work, cpu);
>> +        INIT_WORK(work, rcu_tasks_be_rude);
>> +        schedule_work_on(cpu, work);
>> +    }
>> +
>> +    for_each_online_cpu(cpu)
>> +        flush_work(per_cpu_ptr(&rude_work, cpu));
>> +
>> +end:
>> +    cpus_read_unlock();
>> }
>> 
>> void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
>> -- 
>> 2.25.1
>> 



