On Tue, Aug 13, 2024 at 6:28 PM Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>
> 13.08.2024 16:32, Breno Leitao wrote:
> > Hello Andy,
> >
> > On Fri, Aug 09, 2024 at 02:03:27PM +0300, Andy Shevchenko wrote:
> >> On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@xxxxxxxxxx> wrote:
> >>> On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> >
> >>>> The problem arises because during __pm_runtime_resume(), the spinlock
> >>>> &dev->power.lock is acquired before rpm_resume() is called. Later,
> >>>> rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> >>>> mutexes, triggering the error.
> >>>>
> >>>> To address this issue, devices on ACPI are now marked as not IRQ-safe,
> >>>> considering the dependency of acpi_subsys_runtime_resume() on mutexes.
> >>
> >> This is a step in the right direction
> >
> > Thanks
> >
> >> but somewhere in the replies
> >> here I would like to hear about roadmap to get rid of the
> >> pm_runtime_irq_safe() in all Tegra related code.
> >
> > Agree, that seems the right way to go, but this is a question to
> > maintainers, Laxman and Dmitry.
> >
> > By the way, looking at lore, I found that the last email from Laxman is
> > from 2022. And Dmitry seems to be using a different email!? Let me copy
> > the Dmitry's other email (dmitry.osipenko@xxxxxxxxxxxxx) here.
> >
> >>>> + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
> >>>
> >>> looks good to me, can I have an ack from Andy here?
> >>
> >> I prefer to see something like
> >> is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
> >> has_acpi_companion()
> >> instead depending on the actual ACPI representation of the device.
> >>
> >> Otherwise no objections.
> >> Please, Cc me (andy@xxxxxxxxxx) for the next version.
> >
> > Thanks for the feedback, I agree that leveraging the functions above
> > should be better.
> > What about something as:
> >
> > Author: Breno Leitao <leitao@xxxxxxxxxx>
> > Date:   Thu Jun 6 06:27:07 2024 -0700
> >
> >     Do not mark ACPI devices as irq safe
> >
> >     On ACPI machines, the tegra i2c module encounters an issue due to a
> >     mutex being called inside a spinlock. This leads to the following bug:
> >
> >         BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
> >         in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
> >         preempt_count: 0, expected: 0
> >         RCU nest depth: 0, expected: 0
> >         irq event stamp: 0
> >
> >         Call trace:
> >         __might_sleep
> >         __mutex_lock_common
> >         mutex_lock_nested
> >         acpi_subsys_runtime_resume
> >         rpm_resume
> >         tegra_i2c_xfer
> >
> >     The problem arises because during __pm_runtime_resume(), the spinlock
> >     &dev->power.lock is acquired before rpm_resume() is called. Later,
> >     rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> >     mutexes, triggering the error.
> >
> >     To address this issue, devices on ACPI are now marked as not IRQ-safe,
> >     considering the dependency of acpi_subsys_runtime_resume() on mutexes.
> >
> >     Co-developed-by: Michael van der Westhuizen <rmikey@xxxxxxxx>
> >     Signed-off-by: Michael van der Westhuizen <rmikey@xxxxxxxx>
> >     Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> >
> > diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
> > index 85b31edc558d..1df5b4204142 100644
> > --- a/drivers/i2c/busses/i2c-tegra.c
> > +++ b/drivers/i2c/busses/i2c-tegra.c
> > @@ -1802,9 +1802,9 @@ static int tegra_i2c_probe(struct platform_device *pdev)
> >  	 * domain.
> >  	 *
> >  	 * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
> > -	 * be used for atomic transfers.
> > +	 * be used for atomic transfers. ACPI device is not IRQ safe also.
> >  	 */
> > -	if (!IS_VI(i2c_dev))
> > +	if (!IS_VI(i2c_dev) && !has_acpi_companion(i2c_dev->dev))
> >  		pm_runtime_irq_safe(i2c_dev->dev);
> >
> >  	pm_runtime_enable(i2c_dev->dev);
>
> Looks good, thanks
>
> Reviewed-by: Dmitry Osipenko <digetx@xxxxxxxxx>

LGTM as well, feel free to add
Reviewed-by: Andy Shevchenko <andy@xxxxxxxxxx>
to the above when sending it formally.

> > but somewhere in the replies
> > here I would like to hear about roadmap to get rid of the
> > pm_runtime_irq_safe() in all Tegra related code.
>
> What is the problem with pm_runtime_irq_safe()?

It's a hack. It has no reason to stay in the kernel. It also prevents
PM from working properly (in some cases, not Tegra).

> There were multiple
> problems with RPM for this driver in the past, it wasn't trivial to make
> it work for all Tegra HW generations. Don't expect anyone would want to
> invest time into doing it all over again.

You may always refer to the OMAP case, which used to have 12 (IIRC,
but definitely several) calls to this API and now 0. Taking the OMAP
case into consideration I believe it's quite possible to get rid of
this hack and retire the API completely. Yes, this may take months or
even years. But I would like to have this roadmap documented.

-- 
With Best Regards,
Andy Shevchenko