On 29/04/2022 14:29, Marek Szyprowski wrote:
> Hi Krzysztof,
>
> On 19.04.2022 13:34, Krzysztof Kozlowski wrote:
>> The driver_override field from platform driver should not be initialized
>> from static memory (string literal) because the core later kfree() it,
>> for example when driver_override is set via sysfs.
>>
>> Use dedicated helper to set driver_override properly.
>>
>> Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
>> Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
>> Reviewed-by: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>
>
> This patch landed recently in linux-next as commit 42cd402b8fd4 ("rpmsg:
> Fix kfree() of static memory on setting driver_override"). In my tests I
> found that it triggers the following issue during boot of the
> DragonBoard410c SBC (arch/arm64/boot/dts/qcom/apq8016-sbc.dtb):
>
> ------------[ cut here ]------------
> DEBUG_LOCKS_WARN_ON(lock->magic != lock)
> WARNING: CPU: 1 PID: 8 at kernel/locking/mutex.c:582
> __mutex_lock+0x1ec/0x430
> Modules linked in:
> CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.18.0-rc4-next-20220429 #11815
> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> Workqueue: events_unbound deferred_probe_work_func
> pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : __mutex_lock+0x1ec/0x430
> lr : __mutex_lock+0x1ec/0x430
> ..
> Call trace:
>  __mutex_lock+0x1ec/0x430
>  mutex_lock_nested+0x38/0x64
>  driver_set_override+0x124/0x150
>  qcom_smd_register_edge+0x2a8/0x4ec
>  qcom_smd_probe+0x54/0x80
>  platform_probe+0x68/0xe0
>  really_probe.part.0+0x9c/0x29c
>  __driver_probe_device+0x98/0x144
>  driver_probe_device+0xac/0x14c
>  __device_attach_driver+0xb8/0x120
>  bus_for_each_drv+0x78/0xd0
>  __device_attach+0xd8/0x180
>  device_initial_probe+0x14/0x20
>  bus_probe_device+0x9c/0xa4
>  deferred_probe_work_func+0x88/0xc4
>  process_one_work+0x288/0x6bc
>  worker_thread+0x248/0x450
>  kthread+0x118/0x11c
>  ret_from_fork+0x10/0x20
> irq event stamp: 3599
> hardirqs last enabled at (3599): [<ffff80000919053c>]
> _raw_spin_unlock_irqrestore+0x98/0x9c
> hardirqs last disabled at (3598): [<ffff800009190ba4>]
> _raw_spin_lock_irqsave+0xc0/0xcc
> softirqs last enabled at (3554): [<ffff800008010470>] _stext+0x470/0x5e8
> softirqs last disabled at (3549): [<ffff8000080a4514>]
> __irq_exit_rcu+0x180/0x1ac
> ---[ end trace 0000000000000000 ]---
>
> I don't see any direct relation between the $subject and the above log,
> but reverting the $subject on top of linux next-20220429 hides/fixes it.
> Maybe there is a kind of memory trashing somewhere there and your change
> only revealed it?

Thanks for the report. I think the error path of my patch is wrong - I
should not kfree(rpdev->driver_override) from the rpmsg code. That's the
only thing I see now...

Could you test following patch and tell if it helps?
https://pastebin.ubuntu.com/p/rp3q9Z5fXj/

-----
diff --git a/drivers/rpmsg/rpmsg_internal.h b/drivers/rpmsg/rpmsg_internal.h
index 3e81642238d2..1e2ad944e2ec 100644
--- a/drivers/rpmsg/rpmsg_internal.h
+++ b/drivers/rpmsg/rpmsg_internal.h
@@ -102,11 +102,7 @@ static inline int rpmsg_ctrldev_register_device(struct rpmsg_device *rpdev)
 	if (ret)
 		return ret;
 
-	ret = rpmsg_register_device(rpdev);
-	if (ret)
-		kfree(rpdev->driver_override);
-
-	return ret;
+	return rpmsg_register_device(rpdev);
 }
 
 #endif
diff --git a/drivers/rpmsg/rpmsg_ns.c b/drivers/rpmsg/rpmsg_ns.c
index 8eb8f328237e..f26078467899 100644
--- a/drivers/rpmsg/rpmsg_ns.c
+++ b/drivers/rpmsg/rpmsg_ns.c
@@ -31,11 +31,7 @@ int rpmsg_ns_register_device(struct rpmsg_device *rpdev)
 	rpdev->src = RPMSG_NS_ADDR;
 	rpdev->dst = RPMSG_NS_ADDR;
 
-	ret = rpmsg_register_device(rpdev);
-	if (ret)
-		kfree(rpdev->driver_override);
-
-	return ret;
+	return rpmsg_register_device(rpdev);
 }
 EXPORT_SYMBOL(rpmsg_ns_register_device);
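
For illustration only, the ownership rule from the original commit message
(driver_override gets freed later, so it must never point at a string
literal and should only be set through a helper that stores its own copy)
can be modelled in plain userspace C. This is a rough sketch, not kernel
code: model_device, model_set_override(), model_release() and model_demo()
are hypothetical names invented here, and the sketch does not attempt to
reproduce or explain the mutex warning reported above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct model_device {
	char *driver_override;	/* always NULL or heap memory owned by the device */
};

/*
 * Hypothetical setter: stores a private heap copy of @name, so the
 * release path can free the field no matter where @name came from.
 */
static int model_set_override(struct model_device *dev, const char *name)
{
	size_t len = strlen(name) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, name, len);
	free(dev->driver_override);	/* drop the previous copy, if any */
	dev->driver_override = copy;
	return 0;
}

/* Hypothetical release path: the single place that frees the field. */
static void model_release(struct model_device *dev)
{
	free(dev->driver_override);
	dev->driver_override = NULL;
}

/* Walks through set, overwrite and release; returns 0 on success. */
static int model_demo(void)
{
	static const char literal[] = "rpmsg_ns";
	struct model_device dev = { NULL };

	if (model_set_override(&dev, literal))
		return -1;
	/* The field holds a copy, never the static string itself. */
	assert(dev.driver_override != literal);
	assert(strcmp(dev.driver_override, literal) == 0);

	if (model_set_override(&dev, "rpmsg_chrdev")) {
		model_release(&dev);
		return -1;
	}
	assert(strcmp(dev.driver_override, "rpmsg_chrdev") == 0);

	/* Callers do not free the field themselves; release does it once. */
	model_release(&dev);
	assert(dev.driver_override == NULL);
	return 0;
}
```

Calling model_demo() from a main() is enough to exercise it. The point of
the model is that exactly one path frees the field, which is the same
reasoning behind dropping the extra kfree() in the patch above.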