On Mon, Feb 06, 2023 at 06:31:21PM +0200, Abel Vesa wrote:
> On 23-02-02 18:24:15, Matthias Kaehlcke wrote:
> > Hi Abel,
> >
> > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote:
> > > Currently, there are cases when a domain needs to remain enabled until
> > > the consumer driver probes. Sometimes such consumer drivers may be built
> > > as modules. Since the genpd_power_off_unused is called too early for
> > > such consumer driver modules to get a chance to probe, the domain, since
> > > it is unused, will get disabled. On the other hand, the best time for
> > > an unused domain to be disabled is on the provider's sync_state
> > > callback. So, if the provider has registered a sync_state callback,
> > > assume the unused domains for that provider will be disabled on its
> > > sync_state callback. Also provide a generic sync_state callback which
> > > disables all the domains unused for the provider that registers it.
> > >
> > > Signed-off-by: Abel Vesa <abel.vesa@xxxxxxxxxx>
> > > ---
> > >
> > > This approach has been applied for unused clocks as well.
> > > With this patch merged in, all the providers that have sync_state
> > > callback registered will leave the domains enabled unless the provider's
> > > sync_state callback explicitly disables them. So those providers will
> > > need to add the disabling part to their sync_state callback. On the
> > > other hand, the platforms that have cases where domains need to remain
> > > enabled (even if unused) until the consumer driver probes, will be able,
> > > with this patch in, to run without the pd_ignore_unused kernel argument,
> > > which seems to be the case for most Qualcomm platforms, at this moment.
> >
> > I recently encountered a related issue on a Qualcomm platform with a
> > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use
> > highest corner until sync_state"). The issue involves a DT node with a
> > rpmhpd, the DT node is enabled, however the corresponding device driver
> > is not enabled in the kernel. In such a scenario the sync_state callback
> > is never called, because the genpd consumer never probes. As a result
> > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during
> > system suspend, which results in a substantially higher power consumption
> > in S3.
>
> If I get this correctly, one of the providers is missing (doesn't matter
> the reason), in which case, your kernel needs that driver, period. There
> is no reason why you would expect the consumer to work without the
> provider. Or, you could just remove the property in the devicetree node,
> the property that makes the consumer wait for that provider. Anyway, you
> should never end up with a consumer provider relationship in devicetree
> without providing the provider driver.

I would agree if it was actually a provider that's missing; however, it's
a 'missing' consumer that prevents the sync_state() call.

> > I wonder if genpd (and some other frameworks) needs something like
> > regulator_init_complete(), which turns off unused regulators 30s after
> > system boot. That's conceptually similar to the current
> > genpd_power_off_unused(), but would provide time for modules being loaded.
>
> NACK, timeouts are just another hack in this case, specially when we
> have a pretty reliable mechanism like sync_state.

sync_state does not work properly unless all consumers are probed
successfully. It makes sense to wait some time for the consumers to
probe, but not eternally: it's perfectly valid that a driver for a
(potential) consumer is not enabled.
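
To make the idea more concrete, below is a rough userspace model of the
policy I have in mind. It is only a sketch and not genpd code: the struct,
the function name and the constant are all made up for illustration, and
the 30s value is simply borrowed from the regulator_init_complete() delay
mentioned above. An unused domain is powered off once the provider's
sync_state() has run, or once the timeout has expired, whichever comes
first.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, NOT the genpd API. All names below are
 * invented for illustration only.
 */

/* Assumed grace period, borrowed from the 30s regulator delay. */
#define MODEL_UNUSED_OFF_TIMEOUT_S 30

struct model_domain {
	bool powered;           /* domain is currently on */
	unsigned int users;     /* consumers that have probed and use it */
	bool provider_synced;   /* provider's sync_state() has been called */
};

/*
 * Power off every domain that is on, has no users, and is allowed to be
 * disabled: either sync_state() has run for its provider, or the grace
 * period has elapsed. Returns the number of domains powered off.
 */
size_t model_power_off_unused(struct model_domain *doms, size_t n,
			      unsigned int uptime_s)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		bool may_disable = doms[i].provider_synced ||
				   uptime_s >= MODEL_UNUSED_OFF_TIMEOUT_S;

		if (doms[i].powered && doms[i].users == 0 && may_disable) {
			doms[i].powered = false;
			off++;
		}
	}

	return off;
}
```

With this policy a domain whose only consumer never probes (because its
driver is not enabled) stays on during the grace period, so modules still
get a chance to load, but it is eventually powered off instead of staying
on forever.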