On Thu, Mar 13, 2025 at 6:27 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: > > Hi Geert, > > On Thu, Mar 13, 2025 at 5:48 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > > Hi Saravana, > > > > On Thu, 13 Mar 2025 at 15:32, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > On Thu, 13 Feb 2025 at 11:26, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > > On Thu, 13 Feb 2025 at 09:31, Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > On Mon, Feb 10, 2025 at 2:24 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > > > > On Fri, 7 Feb 2025 at 16:08, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > > > > > > Instrumenting all dev->power.completion accesses in > > > > > > > drivers/base/power/main.c reveals that resume is blocked in dpm_wait() > > > > > > > in the call to wait_for_completion() for regulator-1p2v, which is > > > > > > > indeed a dependency for the SN65DSI86 DSI-DP bridge. Comparing > > > > > > > > > > > > [...] > > > > > > > > > > > > > Looking at /sys/devices/virtual/devlink, the non-working case has the > > > > > > > following extra entries: > > > > > > > > > > > > Note that the SN65DSI86 DSI-DP bridge driver uses the auxiliary bus > > > > > > to create four subdevices: > > > > > > - ti_sn65dsi86.aux.0, > > > > > > - ti_sn65dsi86.bridge.0, > > > > > > - ti_sn65dsi86.gpio.0, > > > > > > - ti_sn65dsi86.pwm.0. > > > > > > None of them have supplier:* symlinks in sysfs, so perhaps that is > > > > > > the root cause of the issue? > > > > > > > > > > Sorry, I haven't had time to look into this closely. Couple of > > > > > questions/suggestions that might give you some answers. > > > > > > > > > > Is this an issue only happening for s2idle or for s2ram too? I'd guess > > > > > both, but if not, that might tell you something? > > > > > > > > The two (very similar) boards I could reproduce the issue on do not > > > > support s2ram yet. > > > > > > > > > The only reason the wait_for_completion() wouldn't work is because the > > > > > supplier is not "completing"? > > > > > > > > Yes, the diff shows ca. 70 additional calls to "complete_all()" in the > > > > good case. > > > > > 4. When the issue happens, /sys/devices/virtual/devlink shows 3 > > > more links: > > > A. platform:feb00000.display is a supplier of platform:fed80000.dsi-encoder > > > B. platform:fed80000.dsi-encoder is a supplier of platform:feb00000.display > > > > Their status file report "dormant". > > > > > C. i2c:1-002c is a supplier of platform:fed80000.dsi-encoder > > > > Its status file reports "available". > > > > > 5. What happens in dpm_noirq_resume_devices()? > > > > > > /* > > > * Trigger the resume of "async" devices upfront so they don't have to > > > * wait for the "non-async" ones they don't depend on. > > > */ > > > i2c-1 (i2c bus) and 1-002c (bridge device) are async, > > > thus triggered first. > > > After that, the remaining devices are resumed. > > > > > > In the bad case: > > > > > > device_resume_noirq(fed80000.dsi-encoder, async=false) > > > dpm_wait_for_superior() > > > parent soc: skipping wait_for_completion() > > > dpm_wait_for_suppliers() > > > supplier feb00000.display: skipped, DL_STATE_DORMANT > > > ^^^^^^^^^^^^^^^^^^^^^^^^^ > > > Cfr. extra link A above (harmless) > > > > > > supplier e6150000.clock-controller: skipping wait_for_completion() > > > supplier 1-002c: wait_for_completion() => BLOCKED > > > ^^^^^^^^^^^^^^^ > > > Cfr. extra link C above, but the bridge device hasn't been resumed yet. > > > > Changing the test for "DL_STATE_DORMANT"[1] in dpm_wait_for_suppliers() > > to also include "DL_STATE_AVAILABLE" makes it skip supplier 1-002c, > > and fixes the issue. Does that make sense? > > Good question. > > DL_STATE_AVAILABLE essentially means that the consumer hasn't been > probed yet, but it doesn't mean that it can be suspended before its > supplier. I really meant "resumed before its supplier", sorry for the confusion. Generally speaking, suppliers need to be resumed first regardless of the status of the consumers. Also, adding a device link between two devices by itself cannot create a deadlock unless it goes against some other dependency.