On 11/1/24 12:53 AM, Manivannan Sadhasivam wrote:
On Thu, Oct 31, 2024 at 02:26:24PM -0700, Bart Van Assche wrote:
The RTC update work involves runtime resuming the UFS controller. Hence,
only start the RTC update work after runtime power management in the UFS
driver has been fully initialized. This patch fixes the following kernel
crash:
Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
Workqueue: events ufshcd_rtc_work
Call trace:
_raw_spin_lock_irqsave+0x34/0x8c (P)
pm_runtime_get_if_active+0x24/0x9c (L)
pm_runtime_get_if_active+0x24/0x9c
ufshcd_rtc_work+0x138/0x1b4
process_one_work+0x148/0x288
worker_thread+0x2cc/0x3d4
kthread+0x110/0x114
ret_from_fork+0x10/0x20
Reported-by: Neil Armstrong <neil.armstrong@xxxxxxxxxx>
Closes: https://lore.kernel.org/linux-scsi/0c0bc528-fdc2-4106-bc99-f23ae377f6f5@xxxxxxxxxx/
Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support")
Cc: Bean Huo <beanhuo@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
Bart, Thanks for the fix! While looking into this patch, I also found the
weirdness of the ufshcd_rpm_*() helpers in ufshcd-priv.h. Their naming doesn't
seem to indicate whether those helpers are for WLUN or for HBA. Also, I don't
see the benefit of these helpers since they just wrap generic pm_runtime*
calls. Then there are other open coding instances in the ufshcd.c. Like
pm_runtime_suspended(&hba->ufs_device_wlun->sdev_gendev)
pm_runtime_set_active(&hba->ufs_device_wlun->sdev_gendev)
Moreover, we do check for the presence of hba->ufs_device_wlun before calling
ufshcd_rpm_get_sync() in ufshcd_remove(). This could be one other way to fix
this null ptr dereference even though I wouldn't recommend doing so as calling
rtc_work early is pointless.
So I think we should remove these helpers to avoid having these discrepancies.
WDYT?
Hi Manivannan,
In the context of the Linux kernel, in general, one-line helper
functions are considered questionable. In this case I prefer to keep the
helper functions since these encapsulate an implementation detail,
namely that the WLUN sdev_gendev member is used to control runtime power
management of the UFS host controller.
Checking whether or not the hba->ufs_device_wlun pointer is NULL from
the ufshcd_rtc_work() function would be racy since that pointer is
modified from another thread. So I prefer the patch at the start of this
thread instead of adding a hba->ufs_device_wlun pointer check.
Thanks,
Bart.