Re: [PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

Alexey Klimov <aklimov@xxxxxxxxxx> · Tue, 16 Mar 2021 03:15:13 +0000

On Fri, Feb 12, 2021 at 7:42 PM Daniel Jordan
<daniel.m.jordan@xxxxxxxxxx> wrote:
>
> Alexey Klimov <aklimov@xxxxxxxxxx> writes:
> > int cpu_device_up(struct device *dev)
>
> Yeah, definitely better to do the wait here.
>
> >  int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >  {
> > -     int cpu, ret = 0;
> > +     struct device *dev;
> > +     cpumask_var_t mask;
> > +     int cpu, ret;
> > +
> > +     if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
> > +             return -ENOMEM;
> >
> > +     ret = 0;
> >       cpu_maps_update_begin();
> >       for_each_online_cpu(cpu) {
> >               if (topology_is_primary_thread(cpu))
> > @@ -2099,18 +2098,35 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >                * called under the sysfs hotplug lock, so it is properly
> >                * serialized against the regular offline usage.
> >                */
> > -             cpuhp_offline_cpu_device(cpu);
> > +             dev = get_cpu_device(cpu);
> > +             dev->offline = true;
> > +
> > +             cpumask_set_cpu(cpu, mask);
> >       }
> >       if (!ret)
> >               cpu_smt_control = ctrlval;
> >       cpu_maps_update_done();
> > +
> > +     /* Tell user space about the state changes */
> > +     for_each_cpu(cpu, mask) {
> > +             dev = get_cpu_device(cpu);
> > +             kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
> > +     }
> > +
> > +     free_cpumask_var(mask);
> >       return ret;
> >  }
>
> Hrm, should the dev manipulation be kept in one place, something like
> this?

The first part of the comment seems problematic to me with regard to such a move:

                 * As this needs to hold the cpu maps lock it's impossible
                 * to call device_offline() because that ends up calling
                 * cpu_down() which takes cpu maps lock. cpu maps lock
                 * needs to be held as this might race against in kernel
                 * abusers of the hotplug machinery (thermal management).

The cpu maps lock is released in cpu_maps_update_done(), hence we would
move the dev->offline update out from under the cpu maps lock. Maybe I
misunderstood the comment and it is really about calling
cpu_down_maps_locked() under the lock to avoid a race?
I failed to find the abusers of the hotplug machinery in drivers/thermal/*
to track down the logic of the potential race, but I might have overlooked
something.
Anyway, if we move the update of dev->offline out, then it makes sense
to restore cpuhp_{offline,online}_cpu_device() and just use them.
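
To make the ordering I have in mind concrete, here is a small user-space
model of it. This is only a sketch and not kernel code: a pthread mutex
stands in for the cpu maps lock, a bitmask stands in for the cpumask, and
every model_* name, the devs[] array and the "even cpus are primary"
rule are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

/* Hypothetical stand-in for the parts of struct device we care about. */
struct model_dev {
	bool offline;		/* models dev->offline */
	int uevents_sent;	/* counts modelled KOBJ_OFFLINE events */
};

/* Stand-in for the cpu maps lock; not the real kernel lock. */
static pthread_mutex_t maps_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_dev devs[NR_CPUS];
static bool cpu_online[NR_CPUS];

/* Assumption for the model: even-numbered cpus are primary threads. */
static bool model_is_primary(int cpu)
{
	return (cpu % 2) == 0;
}

/* Reset the model: every cpu online, no events sent yet. */
static void model_bring_all_online(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_online[cpu] = true;
		devs[cpu].offline = false;
		devs[cpu].uevents_sent = 0;
	}
}

/* Returns the number of cpus that were taken offline and reported. */
static int model_smt_disable(void)
{
	unsigned long mask = 0;
	int cpu, count = 0;

	/* Phase 1: under the lock, only offline cpus and record them. */
	pthread_mutex_lock(&maps_lock);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!cpu_online[cpu] || model_is_primary(cpu))
			continue;
		cpu_online[cpu] = false;
		mask |= 1UL << cpu;
	}
	pthread_mutex_unlock(&maps_lock);

	/* Phase 2: lock dropped, update device state and notify. */
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (1UL << cpu)))
			continue;
		devs[cpu].offline = true;
		devs[cpu].uevents_sent++;
		count++;
	}
	return count;
}
```

In this model, calling model_bring_all_online() and then
model_smt_disable() marks only the odd cpus offline and sends one event
for each, and a second call finds nothing left to do. The point is just
that the device state update and the notification both happen after the
lock is dropped, which is the part I was unsure about above.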

I guess I'll update and re-send the patch and see how it goes.

> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 8817ccdc8e112..aa21219a7b7c4 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -2085,11 +2085,20 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
>                 ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
>                 if (ret)
>                         break;
> +
> +               cpumask_set_cpu(cpu, mask);
> +       }
> +       if (!ret)
> +               cpu_smt_control = ctrlval;
> +       cpu_maps_update_done();
> +
> +       /* Tell user space about the state changes */
> +       for_each_cpu(cpu, mask) {
>                 /*
> -                * As this needs to hold the cpu maps lock it's impossible
> +                * When the cpu maps lock was taken above it was impossible
>                  * to call device_offline() because that ends up calling
>                  * cpu_down() which takes cpu maps lock. cpu maps lock
> -                * needs to be held as this might race against in kernel
> +                * needed to be held as this might race against in kernel
>                  * abusers of the hotplug machinery (thermal management).
>                  *
>                  * So nothing would update device:offline state. That would

Yeah, reading how you re-phrased it, this seems to be about the
cpu_down_maps_locked()/device_offline() locking and the race, rather
than about updating a stale dev->offline.

Thank you,
Alexey