Hello Andy,

On Fri, Aug 09, 2024 at 02:03:27PM +0300, Andy Shevchenko wrote:
> On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@xxxxxxxxxx> wrote:
> > On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> > > The problem arises because during __pm_runtime_resume(), the spinlock
> > > &dev->power.lock is acquired before rpm_resume() is called. Later,
> > > rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> > > mutexes, triggering the error.
> > >
> > > To address this issue, devices on ACPI are now marked as not IRQ-safe,
> > > considering the dependency of acpi_subsys_runtime_resume() on mutexes.
>
> This is a step in the right direction

Thanks

> but somewhere in the replies
> here I would like to hear about roadmap to get rid of the
> pm_runtime_irq_safe() in all Tegra related code.

Agreed, that seems the right way to go, but this is a question for the
maintainers, Laxman and Dmitry.

By the way, looking at lore, I found that the last email from Laxman is
from 2022. And Dmitry seems to be using a different email!? Let me copy
Dmitry's other email (dmitry.osipenko@xxxxxxxxxxxxx) here.

> > > +     if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
> >
> > looks good to me, can I have an ack from Andy here?
>
> I prefer to see something like
> is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
> has_acpi_companion()
> instead depending on the actual ACPI representation of the device.
>
> Otherwise no objections.
> Please, Cc me (andy@xxxxxxxxxx) for the next version.

Thanks for the feedback. I agree that leveraging the functions above
should be better. What about something like:

Author: Breno Leitao <leitao@xxxxxxxxxx>
Date:   Thu Jun 6 06:27:07 2024 -0700

    Do not mark ACPI devices as irq safe

    On ACPI machines, the tegra i2c module encounters an issue due to a
    mutex being called inside a spinlock.
    This leads to the following bug:

        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585

        in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
        preempt_count: 0, expected: 0
        RCU nest depth: 0, expected: 0
        irq event stamp: 0

        Call trace:
        __might_sleep
        __mutex_lock_common
        mutex_lock_nested
        acpi_subsys_runtime_resume
        rpm_resume
        tegra_i2c_xfer

    The problem arises because during __pm_runtime_resume(), the spinlock
    &dev->power.lock is acquired before rpm_resume() is called. Later,
    rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
    mutexes, triggering the error.

    To address this issue, devices on ACPI are now marked as not IRQ-safe,
    considering the dependency of acpi_subsys_runtime_resume() on mutexes.

    Co-developed-by: Michael van der Westhuizen <rmikey@xxxxxxxx>
    Signed-off-by: Michael van der Westhuizen <rmikey@xxxxxxxx>
    Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>

diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 85b31edc558d..1df5b4204142 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -1802,9 +1802,9 @@ static int tegra_i2c_probe(struct platform_device *pdev)
 	 * domain.
 	 *
 	 * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
-	 * be used for atomic transfers.
+	 * be used for atomic transfers. ACPI device is not IRQ safe also.
 	 */
-	if (!IS_VI(i2c_dev))
+	if (!IS_VI(i2c_dev) && !has_acpi_companion(i2c_dev->dev))
 		pm_runtime_irq_safe(i2c_dev->dev);
 
 	pm_runtime_enable(i2c_dev->dev);
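
For anyone following along without the driver source at hand, the new
condition can be modelled in plain userspace C. This is only a sketch:
struct model_i2c_dev and model_should_mark_irq_safe() are made-up names
for illustration, and the two booleans merely stand in for
IS_VI(i2c_dev) and has_acpi_companion(i2c_dev->dev). Nothing below is
kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two properties the probe path checks. */
struct model_i2c_dev {
	bool is_vi;              /* models IS_VI(i2c_dev) */
	bool has_acpi_companion; /* models has_acpi_companion(i2c_dev->dev) */
};

/*
 * Returns true when the modelled probe path would call
 * pm_runtime_irq_safe(): only for a non-VI controller that has no ACPI
 * companion, since the ACPI runtime-resume path may take a mutex.
 */
static inline bool model_should_mark_irq_safe(const struct model_i2c_dev *d)
{
	assert(d != NULL);
	return !d->is_vi && !d->has_acpi_companion;
}
```

In this model only a non-VI controller without an ACPI companion ends
up IRQ-safe, which is what the hunk above is meant to achieve.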