Re: [PATCH] hwmon: (acpi_power_meter) Fix lockdep splat

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020/02/20 8:03, Guenter Roeck wrote:
> Damien Le Moal reports a lockdep splat with the acpi_power_meter,
> observed with Linux v5.5 and later.
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.6.0-rc2+ #629 Not tainted
> ------------------------------------------------------
> python/1397 is trying to acquire lock:
> ffff888619080070 (&resource->lock){+.+.}, at: show_power+0x3c/0xa0 [acpi_power_meter]
> 
>                but task is already holding lock:
> ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160
> 
>                which lock already depends on the new lock.
> 
>                the existing dependency chain (in reverse order) is:
> 
>                -> #1 (kn->count#119){++++}:
>        __kernfs_remove+0x626/0x7e0
>        kernfs_remove_by_name_ns+0x41/0x80
>        remove_attrs+0xcb/0x3c0 [acpi_power_meter]
>        acpi_power_meter_notify+0x1f7/0x310 [acpi_power_meter]
>        acpi_ev_notify_dispatch+0x198/0x1f3
>        acpi_os_execute_deferred+0x4d/0x70
>        process_one_work+0x7c8/0x1340
>        worker_thread+0x94/0xc70
>        kthread+0x2ed/0x3f0
>        ret_from_fork+0x24/0x30
> 
>                -> #0 (&resource->lock){+.+.}:
>        __lock_acquire+0x20be/0x49b0
>        lock_acquire+0x127/0x340
>        __mutex_lock+0x15b/0x1350
>        show_power+0x3c/0xa0 [acpi_power_meter]
>        dev_attr_show+0x3f/0x80
>        sysfs_kf_seq_show+0x216/0x410
>        seq_read+0x407/0xf90
>        vfs_read+0x152/0x2c0
>        ksys_read+0xf3/0x1d0
>        do_syscall_64+0x95/0x1010
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
>                other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(kn->count#119);
>                                lock(&resource->lock);
>                                lock(kn->count#119);
>   lock(&resource->lock);
> 
>                 *** DEADLOCK ***
> 4 locks held by python/1397:
>  #0: ffff8890242d64e0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x9b/0xb0
>  #1: ffff889040be74e0 (&p->lock){+.+.}, at: seq_read+0x6b/0xf90
>  #2: ffff8890448eb880 (&of->mutex){+.+.}, at: kernfs_seq_start+0x47/0x160
>  #3: ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160
> 
>                stack backtrace:
> CPU: 10 PID: 1397 Comm: python Not tainted 5.6.0-rc2+ #629
> Hardware name: Supermicro Super Server/X11DPL-i, BIOS 3.1 05/21/2019
> Call Trace:
>  dump_stack+0x97/0xe0
>  check_noncircular+0x32e/0x3e0
>  ? print_circular_bug.isra.0+0x1e0/0x1e0
>  ? unwind_next_frame+0xb9a/0x1890
>  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
>  ? graph_lock+0x79/0x170
>  ? __lockdep_reset_lock+0x3c0/0x3c0
>  ? mark_lock+0xbc/0x1150
>  __lock_acquire+0x20be/0x49b0
>  ? mark_held_locks+0xe0/0xe0
>  ? stack_trace_save+0x91/0xc0
>  lock_acquire+0x127/0x340
>  ? show_power+0x3c/0xa0 [acpi_power_meter]
>  ? device_remove_bin_file+0x10/0x10
>  ? device_remove_bin_file+0x10/0x10
>  __mutex_lock+0x15b/0x1350
>  ? show_power+0x3c/0xa0 [acpi_power_meter]
>  ? show_power+0x3c/0xa0 [acpi_power_meter]
>  ? mutex_lock_io_nested+0x11f0/0x11f0
>  ? lock_downgrade+0x6a0/0x6a0
>  ? kernfs_seq_start+0x47/0x160
>  ? lock_acquire+0x127/0x340
>  ? kernfs_seq_start+0x6a/0x160
>  ? device_remove_bin_file+0x10/0x10
>  ? show_power+0x3c/0xa0 [acpi_power_meter]
>  show_power+0x3c/0xa0 [acpi_power_meter]
>  dev_attr_show+0x3f/0x80
>  ? memset+0x20/0x40
>  sysfs_kf_seq_show+0x216/0x410
>  seq_read+0x407/0xf90
>  ? security_file_permission+0x16f/0x2c0
>  vfs_read+0x152/0x2c0
> 
> Problem is that reading an attribute takes the kernfs lock in the kernfs
> code, then resource->lock in the driver. During an ACPI notification, the
> opposite happens: The resource lock is taken first, followed by the kernfs
> lock when sysfs attributes are removed and re-created. Presumably this is
> now seen due to some locking related changes in kernfs after v5.4, but it
> was likely always a problem.
> 
> Fix the problem by not blindly acquiring the lock in the notification
> function. It is only needed to protect the various update functions.
> However, those update functions are called anyway when sysfs attributes
> are read. This means that we can just stop calling those functions from
> the notifier, and the resource lock in the notifier function is no longer
> needed.
> 
> That leaves two situations:
> 
> First, METER_NOTIFY_CONFIG removes and re-allocates capability strings.
> While it did so under the resource lock, _displaying_ those strings was not
> protected, creating a race condition. To solve this problem, selectively
> protect both removal/creation and reporting of capability attributes with
> the resource lock.
> 
> Second, removing and re-creating the attribute files is no longer protected
> by the resource lock. That doesn't matter since access to each individual
> attribute is protected by the kernfs lock. Userspace may get messed up if
> attributes disappear and reappear under its nose, but that is not different
> than today, and there is nothing we can do about it without major driver
> restructuring.
> 
> Last but not least, when removing the driver, remove attribute functions
> first, then release capability strings. This avoids yet another race
> condition.
> 
> Reported-by: Damien Le Moal <Damien.LeMoal@xxxxxxx>
> Cc: Damien Le Moal <Damien.LeMoal@xxxxxxx>
> Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> ---
> This is not much more than a kludge, but I think it should do the trick.
> Damien, can you give it a try ?

Just tested with latest rc2. Rebooted several times and did not see any
splat with the patch applied. So please feel free to add:

Tested-by: Damien Le Moal <damien.lemoal@xxxxxxx>

Thanks !

> 
>  drivers/hwmon/acpi_power_meter.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c
> index 4cf25458f0b9..0db8ef4fd6e1 100644
> --- a/drivers/hwmon/acpi_power_meter.c
> +++ b/drivers/hwmon/acpi_power_meter.c
> @@ -355,7 +355,9 @@ static ssize_t show_str(struct device *dev,
>  	struct acpi_device *acpi_dev = to_acpi_device(dev);
>  	struct acpi_power_meter_resource *resource = acpi_dev->driver_data;
>  	acpi_string val;
> +	int ret;
>  
> +	mutex_lock(&resource->lock);
>  	switch (attr->index) {
>  	case 0:
>  		val = resource->model_number;
> @@ -372,8 +374,9 @@ static ssize_t show_str(struct device *dev,
>  		val = "";
>  		break;
>  	}
> -
> -	return sprintf(buf, "%s\n", val);
> +	ret = sprintf(buf, "%s\n", val);
> +	mutex_unlock(&resource->lock);
> +	return ret;
>  }
>  
>  static ssize_t show_val(struct device *dev,
> @@ -817,11 +820,12 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
>  
>  	resource = acpi_driver_data(device);
>  
> -	mutex_lock(&resource->lock);
>  	switch (event) {
>  	case METER_NOTIFY_CONFIG:
> +		mutex_lock(&resource->lock);
>  		free_capabilities(resource);
>  		res = read_capabilities(resource);
> +		mutex_unlock(&resource->lock);
>  		if (res)
>  			break;
>  
> @@ -830,15 +834,12 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
>  		break;
>  	case METER_NOTIFY_TRIP:
>  		sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
> -		update_meter(resource);
>  		break;
>  	case METER_NOTIFY_CAP:
>  		sysfs_notify(&device->dev.kobj, NULL, POWER_CAP_NAME);
> -		update_cap(resource);
>  		break;
>  	case METER_NOTIFY_INTERVAL:
>  		sysfs_notify(&device->dev.kobj, NULL, POWER_AVG_INTERVAL_NAME);
> -		update_avg_interval(resource);
>  		break;
>  	case METER_NOTIFY_CAPPING:
>  		sysfs_notify(&device->dev.kobj, NULL, POWER_ALARM_NAME);
> @@ -848,7 +849,6 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
>  		WARN(1, "Unexpected event %d\n", event);
>  		break;
>  	}
> -	mutex_unlock(&resource->lock);
>  
>  	acpi_bus_generate_netlink_event(ACPI_POWER_METER_CLASS,
>  					dev_name(&device->dev), event, 0);
> @@ -912,8 +912,8 @@ static int acpi_power_meter_remove(struct acpi_device *device)
>  	resource = acpi_driver_data(device);
>  	hwmon_device_unregister(resource->hwmon_dev);
>  
> -	free_capabilities(resource);
>  	remove_attrs(resource);
> +	free_capabilities(resource);
>  
>  	kfree(resource);
>  	return 0;
> 


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux