Re: [PATCH] iio: buffer: Silence lock nesting splat

Jonathan Cameron <jic23@xxxxxxxxxx> · Sat, 20 Aug 2022 12:06:40 +0100

On Tue, 16 Aug 2022 10:08:28 +0200
Vincent Whitchurch <vincent.whitchurch@xxxxxxxx> wrote:

> If an IIO driver uses callbacks from another IIO driver and calls
> iio_channel_start_all_cb() from one of its buffer setup ops, then
> lockdep complains due to the lock nesting, as in the below example with
> lmp91000.  Since the locks are being taken on different IIO devices,
> there is no actual deadlock, so add lock nesting annotation to silence
> the spurious warning.
> 
>  ============================================
>  WARNING: possible recursive locking detected
>  6.0.0-rc1+ #10 Not tainted
>  --------------------------------------------
>  python3/23 is trying to acquire lock:
>  0000000064c944c0 (&indio_dev->mlock){+.+.}-{3:3}, at: iio_update_buffers+0x62/0x180
> 
>  but task is already holding lock:
>  00000000636b64c0 (&indio_dev->mlock){+.+.}-{3:3}, at: enable_store+0x4d/0x100
> 
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
> 
>         CPU0
>         ----
>    lock(&indio_dev->mlock);
>    lock(&indio_dev->mlock);
> 
>   *** DEADLOCK ***
> 
>   May be due to missing lock nesting notation
> 
>  5 locks held by python3/23:
>   #0: 00000000636b5420 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x67/0x100
>   #1: 0000000064c19280 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x13a/0x270
>   #2: 0000000064c3d9e0 (kn->active#14){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x149/0x270
>   #3: 00000000636b64c0 (&indio_dev->mlock){+.+.}-{3:3}, at: enable_store+0x4d/0x100
>   #4: 0000000064c945c8 (&iio_dev_opaque->info_exist_lock){+.+.}-{3:3}, at: iio_update_buffers+0x4f/0x180
> 
>  stack backtrace:
>  CPU: 0 PID: 23 Comm: python3 Not tainted 6.0.0-rc1+ #10
>  Call Trace:
>   dump_stack+0x1a/0x1c
>   __lock_acquire.cold+0x407/0x42d
>   lock_acquire+0x1ed/0x310
>   __mutex_lock+0x72/0xde0
>   mutex_lock_nested+0x1d/0x20
>   iio_update_buffers+0x62/0x180
>   iio_channel_start_all_cb+0x1c/0x20 [industrialio_buffer_cb]
>   lmp91000_buffer_postenable+0x1b/0x20 [lmp91000]
>   __iio_update_buffers+0x50b/0xd80
>   enable_store+0x81/0x100
>   dev_attr_store+0xf/0x20
>   sysfs_kf_write+0x4c/0x70
>   kernfs_fop_write_iter+0x179/0x270
>   new_sync_write+0x99/0x120
>   vfs_write+0x2c1/0x470
>   ksys_write+0x67/0x100
>   sys_write+0x10/0x20
> 
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@xxxxxxxx>

I'm wondering if this is sufficient.
At first glance there are a whole bunch of other possible cases of this.
Any consumer driver that calls iio_device_claim_direct_mode() would be a
problem - though I'm not sure any do?

I'm not sure I properly understand lockdep notations, but I thought the
point was we needed to define a hierarchy?  To do that here we need
an IIO driver that is a consumer to somehow let the IIO core know that
and mark all calls to the locks appropriately.  This gets trickier
as we allow 3+ levels of IIO drivers calling into each other.

We should also think about how to prevent recursion if there are 3
IIO drivers involved.

+CC Peter as most of the fun cases of IIO consumers were from him.

Perhaps this notation is a step in the right direction and we can
look for other problem cases later.  

One side note is that it's not immediately obvious that iio_update_buffers()
is called only from consumers (the other paths use __iio_update_buffers() directly
so if we make this change we should consider renaming that function or
at very least adding some documentation.

Jonathan

> ---
>  drivers/iio/industrialio-buffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
> index acc2b6c05d57..27868ed092d0 100644
> --- a/drivers/iio/industrialio-buffer.c
> +++ b/drivers/iio/industrialio-buffer.c
> @@ -1255,7 +1255,7 @@ int iio_update_buffers(struct iio_dev *indio_dev,
>  		return -EINVAL;
>  
>  	mutex_lock(&iio_dev_opaque->info_exist_lock);
> -	mutex_lock(&indio_dev->mlock);
> +	mutex_lock_nested(&indio_dev->mlock, SINGLE_DEPTH_NESTING);
>  
>  	if (insert_buffer && iio_buffer_is_active(insert_buffer))
>  		insert_buffer = NULL;