On 11/29/18 11:26 AM, Steve Longerbeam wrote:
Hi Niklas,
On 11/29/18 10:47 AM, Niklas Söderlund wrote:
Hi Steve, Sakari and Hans,
I have been made aware of a possible regression by a few users of
rcar-vin and I'm a bit puzzled how to best handle it. Maybe you can help
me out?
The issue is visible when running with LOCKDEP enabled and it prints a
warning about a possible circular locking dependency, see end of mail.
The warning is triggered because rcar-vin takes a mutex (group->lock) in
its async bound callback while the async framework already holds one
(list_lock).
I see two possible solutions to this:
A. Remove acquiring the list_lock in v4l2_async_notifier_init().
B. Move the call to v4l2_async_notifier_init() to the top of
rvin_mc_parse_of_graph() (before acquiring group->lock).
It's most likely safe to remove the list_lock from
v4l2_async_notifier_init(), because all drivers should be calling that
function at probe start, before they begin to add async subdev
descriptors to their notifiers. But just the same, I think it would be
safer to keep list_lock in v4l2_async_notifier_init(), just in case of
some strange corner case (such as a driver that adds descriptors in a
separate thread from the thread that calls v4l2_async_notifier_init()).
Well, on second thought that's probably a lame example; no driver should
be doing that. So removing the list_lock from v4l2_async_notifier_init()
is probably safe. The notifier is not registered with v4l2-async at that
point.
Steve
So I would prefer B, but I'm open to either solution.
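Option B, reduced to a user-space model (a minimal sketch: pthread
mutexes stand in for the kernel mutexes, and the real body of
rvin_mc_parse_of_graph() is elided):

```c
#include <pthread.h>

/* Stand-ins for the two mutexes discussed in the thread:
 * list_lock  - taken by the v4l2-async core (and, today, by
 *              v4l2_async_notifier_init())
 * group_lock - the driver-specific rvin group->lock
 */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models v4l2_async_notifier_init(), which currently takes list_lock
 * while initializing the notifier. */
static void notifier_init(void)
{
	pthread_mutex_lock(&list_lock);
	/* INIT_LIST_HEAD(&notifier->asd_list); ... */
	pthread_mutex_unlock(&list_lock);
}

/* Option B: call notifier_init() BEFORE taking group_lock, so the
 * group_lock -> list_lock ordering is never created at probe time. */
static int parse_of_graph_fixed(void)
{
	notifier_init();               /* list_lock taken and released here */
	pthread_mutex_lock(&group_lock);
	/* walk the OF graph, add async subdev descriptors ... */
	pthread_mutex_unlock(&group_lock);
	return 0;
}
```

With this ordering the init-time list_lock is dropped before group_lock
is ever acquired, so no group_lock -> list_lock dependency is recorded.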
I traced the issue back to [1]. I don't believe there is any real
trouble here unless one of the async callbacks were to call back into
the async framework itself, which would trigger further calls to driver
callbacks. Or maybe I'm naive and this is a real problem today.
Even if it's not a real problem today, I'm not sure it never will be;
it would be nice if this warning could be handled somehow.
It is my understanding that any implementation of the async callbacks
that takes a driver-specific lock will trigger this warning, which is
not nice.
It has always been the case that v4l2-async holds the list_lock when
it calls a driver's async bound callback.
So the problem is only that v4l2_async_notifier_init() acquires the
list_lock, which can create the reverse lock order
driver-specific-lock -> list_lock. That is solved by either A or B
above.
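The inversion, reduced to its two paths (a minimal user-space sketch
with pthread mutexes standing in for the kernel ones; run sequentially
neither path deadlocks on its own, it is the combination of the two
recorded orders across CPUs that lockdep flags):

```c
#include <pthread.h>

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;

/* Path 1: rvin_mc_parse_of_graph() holds group_lock while
 * v4l2_async_notifier_init() takes list_lock.
 * Ordering recorded: group_lock -> list_lock */
static int probe_path(void)
{
	pthread_mutex_lock(&group_lock);
	pthread_mutex_lock(&list_lock);   /* v4l2_async_notifier_init() */
	pthread_mutex_unlock(&list_lock);
	pthread_mutex_unlock(&group_lock);
	return 0;
}

/* Path 2: v4l2_async_notifier_register() holds list_lock while the
 * bound callback (rvin_group_notify_bound) takes group_lock.
 * Ordering recorded: list_lock -> group_lock */
static int register_path(void)
{
	pthread_mutex_lock(&list_lock);
	pthread_mutex_lock(&group_lock);  /* rvin_group_notify_bound() */
	pthread_mutex_unlock(&group_lock);
	pthread_mutex_unlock(&list_lock);
	return 0;
}
```

If two tasks ran these paths concurrently, each could acquire its first
lock and then block forever on the other's; that is exactly the
"possible unsafe locking scenario" in the lockdep output below.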
Steve
Any suggestions or hints on how to move forward with this would be
appreciated.
1. eae2aed1eab9bf08 ("media: v4l2-fwnode: Switch to
v4l2_async_notifier_add_subdev")
---->> Warning output <<----
======================================================
WARNING: possible circular locking dependency detected
4.19.0-rc1-arm64-renesas-00212-geae2aed1eab9bf08 #56 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
(____ptrval____) (&group->lock){+.+.}, at:
rvin_group_notify_bound+0x30/0xa8
but task is already holding lock:
(____ptrval____) (list_lock){+.+.}, at:
__v4l2_async_notifier_register+0x54/0x1b0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (list_lock){+.+.}:
__mutex_lock+0x70/0x7f0
mutex_lock_nested+0x1c/0x28
v4l2_async_notifier_init+0x28/0x48
rcar_vin_probe+0x13c/0x630
platform_drv_probe+0x50/0xa0
really_probe+0x1e0/0x298
driver_probe_device+0x54/0xe8
__driver_attach+0xf0/0xf8
bus_for_each_dev+0x70/0xc0
driver_attach+0x20/0x28
bus_add_driver+0x1d4/0x200
driver_register+0x60/0x110
__platform_driver_register+0x44/0x50
rcar_vin_driver_init+0x18/0x20
do_one_initcall+0x180/0x35c
kernel_init_freeable+0x454/0x4f8
kernel_init+0x10/0xfc
ret_from_fork+0x10/0x1c
-> #0 (&group->lock){+.+.}:
lock_acquire+0xc8/0x238
__mutex_lock+0x70/0x7f0
mutex_lock_nested+0x1c/0x28
rvin_group_notify_bound+0x30/0xa8
v4l2_async_match_notify+0x50/0x138
v4l2_async_notifier_try_all_subdevs+0x58/0xb8
__v4l2_async_notifier_register+0xdc/0x1b0
v4l2_async_notifier_register+0x38/0x58
rcar_vin_probe+0x1b8/0x630
platform_drv_probe+0x50/0xa0
really_probe+0x1e0/0x298
driver_probe_device+0x54/0xe8
__driver_attach+0xf0/0xf8
bus_for_each_dev+0x70/0xc0
driver_attach+0x20/0x28
bus_add_driver+0x1d4/0x200
driver_register+0x60/0x110
__platform_driver_register+0x44/0x50
rcar_vin_driver_init+0x18/0x20
do_one_initcall+0x180/0x35c
kernel_init_freeable+0x454/0x4f8
kernel_init+0x10/0xfc
ret_from_fork+0x10/0x1c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(list_lock);
lock(&group->lock);
lock(list_lock);
lock(&group->lock);
*** DEADLOCK ***
2 locks held by swapper/0/1:
#0: (____ptrval____) (&dev->mutex){....}, at:
__driver_attach+0x60/0xf8
#1: (____ptrval____) (list_lock){+.+.}, at:
__v4l2_async_notifier_register+0x54/0x1b0
stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted
4.19.0-rc1-arm64-renesas-00212-geae2aed1eab9bf08 #56
Hardware name: Renesas Salvator-X 2nd version board based on
r8a77965 (DT)
Call trace:
dump_backtrace+0x0/0x188
show_stack+0x14/0x20
dump_stack+0xbc/0xf4
print_circular_bug.isra.18+0x270/0x2d8
__lock_acquire+0x12e8/0x17c8
lock_acquire+0xc8/0x238
__mutex_lock+0x70/0x7f0
mutex_lock_nested+0x1c/0x28
rvin_group_notify_bound+0x30/0xa8
v4l2_async_match_notify+0x50/0x138
v4l2_async_notifier_try_all_subdevs+0x58/0xb8
__v4l2_async_notifier_register+0xdc/0x1b0
v4l2_async_notifier_register+0x38/0x58
rcar_vin_probe+0x1b8/0x630
platform_drv_probe+0x50/0xa0
really_probe+0x1e0/0x298
driver_probe_device+0x54/0xe8
__driver_attach+0xf0/0xf8
bus_for_each_dev+0x70/0xc0
driver_attach+0x20/0x28
bus_add_driver+0x1d4/0x200
driver_register+0x60/0x110
__platform_driver_register+0x44/0x50
rcar_vin_driver_init+0x18/0x20
do_one_initcall+0x180/0x35c
kernel_init_freeable+0x454/0x4f8
kernel_init+0x10/0xfc
ret_from_fork+0x10/0x1c