Re: [PATCH] ACPI / PM: Fix acpi_pm_notifier_lock vs. flush_workqueue() deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 8, 2017 at 12:06 AM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> On Tuesday, November 7, 2017 11:47:54 PM CET Rafael J. Wysocki wrote:
>> On Tue, Nov 7, 2017 at 10:08 PM, Ville Syrjala
>> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>> > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>> >
>> > acpi_remove_pm_notifier() ends up calling flush_workqueue() while
>> > holding acpi_pm_notifier_lock, and that same lock is taken by
>> > by the work via acpi_pm_notify_handler(). This can deadlock.
>>
>> OK, good catch!
>>
>> [cut]
>>
>> >
>> > Cc: stable@xxxxxxxxxxxxxxx
>> > Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> > Cc: Len Brown <lenb@xxxxxxxxxx>
>> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> > Cc: Tejun Heo <tj@xxxxxxxxxx>
>> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> > Fixes: c072530f391e ("ACPI / PM: Revork the handling of ACPI device wakeup notifications")
>> > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>> > ---
>> >  drivers/acpi/device_pm.c | 21 ++++++++++++---------
>> >  1 file changed, 12 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
>> > index fbcc73f7a099..18af71057b44 100644
>> > --- a/drivers/acpi/device_pm.c
>> > +++ b/drivers/acpi/device_pm.c
>> > @@ -387,6 +387,7 @@ EXPORT_SYMBOL(acpi_bus_power_manageable);
>> >
>> >  #ifdef CONFIG_PM
>> >  static DEFINE_MUTEX(acpi_pm_notifier_lock);
>> > +static DEFINE_MUTEX(acpi_pm_notifier_install_lock);
>> >
>> >  void acpi_pm_wakeup_event(struct device *dev)
>> >  {
>> > @@ -443,24 +444,25 @@ acpi_status acpi_add_pm_notifier(struct acpi_device *adev, struct device *dev,
>> >         if (!dev && !func)
>> >                 return AE_BAD_PARAMETER;
>> >
>> > -       mutex_lock(&acpi_pm_notifier_lock);
>> > +       mutex_lock(&acpi_pm_notifier_install_lock);
>> >
>> >         if (adev->wakeup.flags.notifier_present)
>> >                 goto out;
>> >
>> > -       adev->wakeup.ws = wakeup_source_register(dev_name(&adev->dev));
>> > -       adev->wakeup.context.dev = dev;
>> > -       adev->wakeup.context.func = func;
>> > -
>>
>> But this doesn't look good to me.
>>
>> notifier_present should be checked under acpi_pm_notifier_lock.
>>
>> Actually, acpi_install_notify_handler() itself need not be called
>> under the lock, because it does sufficient checks of its own.
>>
>> So say you do
>>
>> mutex_lock(&acpi_pm_notifier_lock);
>>
>> if (adev->wakeup.context.dev)
>>         goto out;
>>
>> adev->wakeup.ws = wakeup_source_register(dev_name(&adev->dev));
>> adev->wakeup.context.dev = dev;
>> adev->wakeup.context.func = func;
>>
>> mutex_unlock(&acpi_pm_notifier_lock);
>>
>> >         status = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
>> >                                              acpi_pm_notify_handler, NULL);
>> >         if (ACPI_FAILURE(status))
>> >                 goto out;
>> >
>> > +       mutex_lock(&acpi_pm_notifier_lock);
>>
>> And here you just set notifier_present under acpi_pm_notifier_lock.
>>
>> > +       adev->wakeup.ws = wakeup_source_register(dev_name(&adev->dev));
>> > +       adev->wakeup.context.dev = dev;
>> > +       adev->wakeup.context.func = func;
>> >         adev->wakeup.flags.notifier_present = true;
>> > +       mutex_unlock(&acpi_pm_notifier_lock);
>> >
>> >   out:
>> > -       mutex_unlock(&acpi_pm_notifier_lock);
>> > +       mutex_unlock(&acpi_pm_notifier_install_lock);
>> >         return status;
>> >  }
>>
>> Then on removal you can clear notifier_present first and drop the lock
>> around the acpi_remove_notify_handler() call and nothing bad will
>> happen.
>>
>> If you call acpi_add_pm_notifier() twice in parallel, the first
>> instance will set context.dev and the second one will see it set and
>> bail out.  The first instance will then do the rest.
>>
>> If you call acpi_remove_pm_notifier() twice in a row, the first
>> instance will see notifier_present set and will clear it, so the
>> second one will see notifier_present unset and it will bail out.
>>
>> Now, if you call acpi_remove_pm_notifier() in parallel with
>> acpi_add_pm_notifier(), either the former will see notifier_present
>> unset and bail out, or the latter will see context.dev unset and bail
>> out.
>>
>> It doesn't look like the outer lock is needed, or have I missed anything?
>
> So something like the below (totally untested) should work too, shouldn't it?

There is a problem if a device is removed while acpi_add_pm_notifier()
is still in progress, in which case with my patch the
acpi_remove_pm_notifier() called from the removal path may bail out
prematurely (that doesn't seem likely to happen, but still I don't see
why it cannot happen), so I'll just use your patch. :-)

Thanks,
Rafael




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]