Re: s2idle blocked on dev->power.completion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Geert,

On Fri, Mar 14, 2025 at 10:05 AM Geert Uytterhoeven
<geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Rafael,
>
> On Thu, 13 Mar 2025 at 18:35, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > On Thu, Mar 13, 2025 at 6:27 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > On Thu, Mar 13, 2025 at 5:48 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > On Thu, 13 Mar 2025 at 15:32, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > On Thu, 13 Feb 2025 at 11:26, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > On Thu, 13 Feb 2025 at 09:31, Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > > > > > On Mon, Feb 10, 2025 at 2:24 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > > > On Fri, 7 Feb 2025 at 16:08, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > > > > Instrumenting all dev->power.completion accesses in
> > > > > > > > > drivers/base/power/main.c reveals that resume is blocked in dpm_wait()
> > > > > > > > > in the call to wait_for_completion() for regulator-1p2v, which is
> > > > > > > > > indeed a dependency for the SN65DSI86 DSI-DP bridge.  Comparing
> > > > > > > >
> > > > > > > > [...]
> > > > > > > >
> > > > > > > > > Looking at /sys/devices/virtual/devlink, the non-working case has the
> > > > > > > > > following extra entries:
> > > > > > > >
> > > > > > > > Note that the SN65DSI86 DSI-DP bridge driver uses the auxiliary bus
> > > > > > > > to create four subdevices:
> > > > > > > >   - ti_sn65dsi86.aux.0,
> > > > > > > >   - ti_sn65dsi86.bridge.0,
> > > > > > > >   - ti_sn65dsi86.gpio.0,
> > > > > > > >   - ti_sn65dsi86.pwm.0.
> > > > > > > > None of them have supplier:* symlinks in sysfs, so perhaps that is
> > > > > > > > the root cause of the issue?
> > > > > > >
> > > > > > > Sorry, I haven't had time to look into this closely. Couple of
> > > > > > > questions/suggestions that might give you some answers.
> > > > > > >
> > > > > > > Is this an issue only happening for s2idle or for s2ram too? I'd guess
> > > > > > > both, but if not, that might tell you something?
> > > > > >
> > > > > > The two (very similar) boards I could reproduce the issue on do not
> > > > > > support s2ram yet.
> > > > > >
> > > > > > > The only reason the wait_for_completion() wouldn't work is because the
> > > > > > > supplier is not "completing"?
> > > > > >
> > > > > > Yes, the diff shows ca. 70 additional calls to "complete_all()" in the
> > > > > > good case.
> > > >
> > > > >   4. When the issue happens, /sys/devices/virtual/devlink shows 3
> > > > >      more links:
> > > > >        A. platform:feb00000.display is a supplier of platform:fed80000.dsi-encoder
> > > > >        B. platform:fed80000.dsi-encoder is a supplier of platform:feb00000.display
> > > >
> > > > Their status file report "dormant".
> > > >
> > > > >        C. i2c:1-002c is a supplier of platform:fed80000.dsi-encoder
> > > >
> > > > Its status file reports "available".
> > > >
> > > > >   5. What happens in dpm_noirq_resume_devices()?
> > > > >
> > > > >        /*
> > > > >         * Trigger the resume of "async" devices upfront so they don't have to
> > > > >         * wait for the "non-async" ones they don't depend on.
> > > > >         */
> > > > >         i2c-1 (i2c bus) and 1-002c (bridge device) are async,
> > > > >         thus triggered first.
> > > > >         After that, the remaining devices are resumed.
> > > > >
> > > > >      In the bad case:
> > > > >
> > > > >        device_resume_noirq(fed80000.dsi-encoder, async=false)
> > > > >          dpm_wait_for_superior()
> > > > >            parent soc: skipping wait_for_completion()
> > > > >            dpm_wait_for_suppliers()
> > > > >              supplier feb00000.display: skipped, DL_STATE_DORMANT
> > > > >              ^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > Cfr. extra link A above (harmless)
> > > > >
> > > > >              supplier e6150000.clock-controller: skipping wait_for_completion()
> > > > >              supplier 1-002c: wait_for_completion() => BLOCKED
> > > > >              ^^^^^^^^^^^^^^^
> > > > > Cfr. extra link C above, but the bridge device hasn't been resumed yet.
> > > >
> > > > Changing the test for "DL_STATE_DORMANT"[1] in dpm_wait_for_suppliers()
> > > > to also include "DL_STATE_AVAILABLE" makes it skip supplier 1-002c,
> > > > and fixes the issue.  Does that make sense?
> > >
> > > Good question.
> > >
> > > DL_STATE_AVAILABLE essentially means that the consumer hasn't been
> > > probed yet, but it doesn't mean that it can be suspended before its
> > > supplier.
> >
> > I really meant "resumed before its supplier", sorry for the confusion.
> >
> > Generally speaking, suppliers need to be resumed first regardless of
> > the status of the consumers.
>
> Exactly, and that's being violated here.
> Before resume, dpm_noirq_list contains:
>      [...]
>    - fed80000.dsi-encoder
>      [...]
>    - regulator-1p2v
>      [...]
>    - 1-002c
> which is the order in which devices are to be resumed.
>
> regulator-1p2v is a supplier of 1-002c: OK.
> 1-002c is a supplier of fed80000.dsi-encoder: NG.
>
> As devices are resumed in the inverse order they have been suspended
> before, suspend order is also wrong.  Hence checking also for
> DL_STATE_AVAILABLE would just fix the symptom, and not the cause, right?

Right.





[Index of Archives]     [Linux Samsung SOC]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux