Re: [PATCH 01/18] pwm: Provide devm_pwmchip_alloc() function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
<u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
>
> Hello Saravana,
>
> you were pointed out to me as the expert for device links. I found a
> problem with these.
>
> On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > Today I managed to trigger the problem I intend to address with this
> > series. My machine to test this on is an stm32mp157. To be able to
> > trigger the problem reliably I applied the following patches on top of
> > v6.5-rc1:
> >
> >  - pwm: stm32: Don't modify HW state in .remove() callback
> >    This is a cleanup that I already sent out.
> >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@xxxxxxxxxxxxxx
> >    The purpose for reproducing the problem is to not trigger further
> >    calls to the apply callback.
> >
> >  - The following patch:
> >
> > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > index 687967d3265f..c7fc02b0fa3c 100644
> > --- a/drivers/pwm/pwm-stm32.c
> > +++ b/drivers/pwm/pwm-stm32.c
> > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> >       int ret;
> >
> > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > +     msleep(5000);
> > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > +
> >       enabled = pwm->state.enabled;
> >
> >       if (enabled && !state->enabled) {
> > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> >  {
> >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> >
> > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> >       pwmchip_remove(&priv->chip);
> > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > +
> > +     priv->regmap = NULL;
> >  }
> >
> >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> >
> > The first hunk is only there to widen the race window. The second is to
> > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > understand why. *shrug*)
> >
> > The device tree contains a pwm-fan device making use of one of the PWMs.
> >
> > Now I do the following:
> >
> >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> >
> > Unbinding the fan device has two effects:
> >
> >  - The device link between fan and pwm looses its property to unbind fan
> >    when pwm gets unbound.
> >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> >  - It calls pwm_fan_cleanup() which triggers a call to
> >    pwm_apply_state().
> >
> > So when the pwm device gets unbound the first thread is sleeping in
> > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> >
> > This looks as follows:
> >
> > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > [  192.240727] 8<--- cut here ---
> > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > ...
> >
> > Even without the crash you can see that stm32_pwm_apply() is still
> > running after pwmchip_remove() completed.
> >
> > I'm unsure if the device link could be improved here to ensure that the
> > fan is completely unbound even if it started unbinding already before
> > the pwm device gets unbound. (And if it could, would this fit the device
> > links purpose and so be a sensible improvement?)
>
> While I think that there is something to be done in the pwm core that
> this doesn't explode (i.e. do proper lifetime tracking such that a
> pwm_chip doesn't disappear while still being used---and I'm working on
> that) I expected that the device links between pwm consumer and provider
> would prevent the above described oops, too. But somehow the fan already
> going away (but still using the PWM) when the PWM is unbound, results in
> the PWM disappearing before the fan is completely gone.
>
> Is this expected, or a problem that can (and should?) be fixed?

I didn't read your full series, but I read this email. With what's in
this email, the problem seems to be in the driver or the pwm
framework. The pwm driver/framework can't tell the driver core that
you successfully unbound (returning from .remove()) before you have
finish all your ongoing transactions with the device. If your
"apply()" is still running, you need to make sure it's complete before
.remove() does any resource releasing/clean up.

Also, how is the consumer driver's .remove() succeeding if it has an
ongoing pwm call()? This all sounds like insufficient locking and
critical region protection in both the consumer and supplier.

Device links can't do anything here because you are giving it wrong
info -- that the unbind was successful before it actually is.

Device links will and can make sure that the consumer is unbound
successfully before the unbind is called on the supplier. And it looks
like that's still true here.

-Saravana

>
> If you need more context or a tester, don't hesitate to ask.
>
> Best regards
> Uwe
>
> --
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |





[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux