On Mon, Nov 25, 2024 at 9:05 AM A. Sverdlin <alexander.sverdlin@xxxxxxxxxxx> wrote: > > From: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxxxx> > > The problem apparetly has been known since the conversion to > raw_spin_lock() (commit 4dbada2be460 > ("gpio: omap: use raw locks for locking")). > > Symptom: > > [ BUG: Invalid wait context ] > 5.10.214 > ----------------------------- > swapper/1 is trying to lock: > (enable_lock){....}-{3:3}, at: clk_enable_lock > other info that might help us debug this: > context-{5:5} > 2 locks held by swapper/1: > #0: (&dev->mutex){....}-{4:4}, at: device_driver_attach > #1: (&bank->lock){....}-{2:2}, at: omap_gpio_set_config > stack backtrace: > CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.214 > Hardware name: Generic AM33XX (Flattened Device Tree) > unwind_backtrace > show_stack > __lock_acquire > lock_acquire.part.0 > _raw_spin_lock_irqsave > clk_enable_lock > clk_enable > omap_gpio_set_config > gpio_keys_setup_key > gpio_keys_probe > platform_drv_probe > really_probe > driver_probe_device > device_driver_attach > __driver_attach > bus_for_each_dev > bus_add_driver > driver_register > do_one_initcall > do_initcalls > kernel_init_freeable > kernel_init > > Problematic spin_lock_irqsave(&enable_lock, ...) is being called by > clk_enable()/clk_disable() in omap2_set_gpio_debounce() and > omap_clear_gpio_debounce(). > > For omap2_set_gpio_debounce() it's possible to move > raw_spin_lock_irqsave(&bank->lock, ...) inside omap2_set_gpio_debounce() > so that the locks nest as follows: > > clk_enable(bank->dbck) > raw_spin_lock_irqsave(&bank->lock, ...) > raw_spin_unlock_irqrestore() > clk_disable() > > Two call-sites of omap_clear_gpio_debounce() are more convoluted, but one > can take the advantage of the nesting nature of clk_enable()/clk_disable(), > so that the inner clk_disable() becomes lockless: > > clk_enable(bank->dbck) <-- only to clk_enable_lock() > raw_spin_lock_irqsave(&bank->lock, ...) > omap_clear_gpio_debounce() > clk_disable() <-- becomes lockless > raw_spin_unlock_irqrestore() > clk_disable() > > Cc: stable@xxxxxxxxxxxxxxx > Fixes: 4dbada2be460 ("gpio: omap: use raw locks for locking") > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxxxx> > --- > drivers/gpio/gpio-omap.c | 35 ++++++++++++++++++++++++++++++----- > 1 file changed, 30 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c > index 7ad4534054962..f9e502aa57753 100644 > --- a/drivers/gpio/gpio-omap.c > +++ b/drivers/gpio/gpio-omap.c > @@ -181,6 +181,7 @@ static inline void omap_gpio_dbck_disable(struct gpio_bank *bank) > static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, > unsigned debounce) > { > + unsigned long flags; > u32 val; > u32 l; > bool enable = !!debounce; > @@ -196,13 +197,18 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, > > l = BIT(offset); > > + /* > + * Ordering is important here: clk_enable() calls spin_lock_irqsave(), > + * therefore it must be outside of the following raw_spin_lock_irqsave() > + */ > clk_enable(bank->dbck); > + raw_spin_lock_irqsave(&bank->lock, flags); > + > writel_relaxed(debounce, bank->base + bank->regs->debounce); > > val = omap_gpio_rmw(bank->base + bank->regs->debounce_en, l, enable); > bank->dbck_enable_mask = val; > > - clk_disable(bank->dbck); > /* > * Enable debounce clock per module. > * This call is mandatory because in omap_gpio_request() when > @@ -217,6 +223,9 @@ static int omap2_set_gpio_debounce(struct gpio_bank *bank, unsigned offset, > bank->context.debounce_en = val; > } > > + raw_spin_unlock_irqrestore(&bank->lock, flags); > + clk_disable(bank->dbck); > + This part looks pretty clear to me. > return 0; > } > > @@ -647,6 +656,13 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) > unsigned long flags; > unsigned offset = d->hwirq; > > + /* > + * Enable the clock here so that the nested clk_disable() in the > + * following omap_clear_gpio_debounce() is lockless > + */ > + if (bank->dbck_flag) > + clk_enable(bank->dbck); > + But this looks like a functional change. You effectively bump the clock enable count but don't add a corresponding clk_disable() in the affected path. Is the clock ever actually disabled then? Am I not getting something? Bart > raw_spin_lock_irqsave(&bank->lock, flags); > bank->irq_usage &= ~(BIT(offset)); > omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE); > @@ -656,6 +672,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d) > omap_clear_gpio_debounce(bank, offset); > omap_disable_gpio_module(bank, offset); > raw_spin_unlock_irqrestore(&bank->lock, flags); > + > + if (bank->dbck_flag) > + clk_disable(bank->dbck); > } > > static void omap_gpio_irq_bus_lock(struct irq_data *data) > @@ -827,6 +846,13 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) > struct gpio_bank *bank = gpiochip_get_data(chip); > unsigned long flags; > > + /* > + * Enable the clock here so that the nested clk_disable() in the > + * following omap_clear_gpio_debounce() is lockless > + */ > + if (bank->dbck_flag) > + clk_enable(bank->dbck); > + > raw_spin_lock_irqsave(&bank->lock, flags); > bank->mod_usage &= ~(BIT(offset)); > if (!LINE_USED(bank->irq_usage, offset)) { > @@ -836,6 +862,9 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset) > omap_disable_gpio_module(bank, offset); > raw_spin_unlock_irqrestore(&bank->lock, flags); > > + if (bank->dbck_flag) > + clk_disable(bank->dbck); > + > pm_runtime_put(chip->parent); > } > > @@ -913,15 +942,11 @@ static int omap_gpio_debounce(struct gpio_chip *chip, unsigned offset, > unsigned debounce) > { > struct gpio_bank *bank; > - unsigned long flags; > int ret; > > bank = gpiochip_get_data(chip); > > - raw_spin_lock_irqsave(&bank->lock, flags); > ret = omap2_set_gpio_debounce(bank, offset, debounce); > - raw_spin_unlock_irqrestore(&bank->lock, flags); > - > if (ret) > dev_info(chip->parent, > "Could not set line %u debounce to %u microseconds (%d)", > -- > 2.47.0 >