Hi, On Thu, Dec 12, 2024 at 12:26:42PM +0100, Ulf Hansson wrote: > On Thu, 12 Dec 2024 at 00:11, Peter Geis <pgwipeout@xxxxxxxxx> wrote: > > On Wed, Dec 11, 2024 at 3:46 PM Sebastian Reichel > > <sebastian.reichel@xxxxxxxxxxxxx> wrote: > > > On Wed, Dec 11, 2024 at 02:53:34PM -0500, Peter Geis wrote: > > > > On Wed, Dec 11, 2024 at 9:32 AM Sebastian Reichel > > > > <sebastian.reichel@xxxxxxxxxxxxx> wrote: > > > > > > > > > > Currently rockchip_do_pmu_set_power_domain prints a warning if there > > > > > have been errors turning on the power domain, but it does not return > > > > > any errors and rockchip_pd_power() tries to continue setting up the > > > > > QOS registers. This usually results in accessing unpowered registers, > > > > > which triggers an SError and a full system hang. > > > > > > > > > > This improves the error handling by forwarding the error to avoid > > > > > kernel panics. > > > > > > > > I think we should merge your patch here with my patch for returning > > > > errors from rockchip_pmu_set_idle_request [1]. > > > > > > I will have a look. > > > > > > > > Reviewed-by: Heiko Stuebner <heiko@xxxxxxxxx> > > > > > Tested-by: Heiko Stuebner <heiko@xxxxxxxxx> > > > > > Tested-by: Adrian Larumbe <adrian.larumbe@xxxxxxxxxxxxx> # On Rock 5B > > > > > Signed-off-by: Sebastian Reichel <sebastian.reichel@xxxxxxxxxxxxx> > > > > > --- > > > > > drivers/pmdomain/rockchip/pm-domains.c | 34 +++++++++++++++++--------- > > > > > 1 file changed, 22 insertions(+), 12 deletions(-) > > > > > > > > > > diff --git a/drivers/pmdomain/rockchip/pm-domains.c b/drivers/pmdomain/rockchip/pm-domains.c > > > > > index a161ee13c633..8f440f2883db 100644 > > > > > --- a/drivers/pmdomain/rockchip/pm-domains.c > > > > > +++ b/drivers/pmdomain/rockchip/pm-domains.c > > > > > @@ -533,16 +533,17 @@ static int rockchip_pmu_domain_mem_reset(struct rockchip_pm_domain *pd) > > > > > return ret; > > > > > } > > > > > > > > > > -static void rockchip_do_pmu_set_power_domain(struct rockchip_pm_domain *pd, > > > > > - bool on) > > > > > +static int rockchip_do_pmu_set_power_domain(struct rockchip_pm_domain *pd, > > > > > + bool on) > > > > > { > > > > > struct rockchip_pmu *pmu = pd->pmu; > > > > > struct generic_pm_domain *genpd = &pd->genpd; > > > > > u32 pd_pwr_offset = pd->info->pwr_offset; > > > > > bool is_on, is_mem_on = false; > > > > > + int ret; > > > > > > > > > > if (pd->info->pwr_mask == 0) > > > > > - return; > > > > > + return 0; > > > > > > > > > > if (on && pd->info->mem_status_mask) > > > > > is_mem_on = rockchip_pmu_domain_is_mem_on(pd); > > > > > @@ -557,16 +558,21 @@ static void rockchip_do_pmu_set_power_domain(struct rockchip_pm_domain *pd, > > > > > > > > > > wmb(); > > > > > > > > > > - if (is_mem_on && rockchip_pmu_domain_mem_reset(pd)) > > > > > - return; > > > > > + if (is_mem_on) { > > > > > + ret = rockchip_pmu_domain_mem_reset(pd); > > > > > + if (ret) > > > > > + return ret; > > > > > + } > > > > > > > > > > - if (readx_poll_timeout_atomic(rockchip_pmu_domain_is_on, pd, is_on, > > > > > - is_on == on, 0, 10000)) { > > > > > - dev_err(pmu->dev, > > > > > - "failed to set domain '%s', val=%d\n", > > > > > - genpd->name, is_on); > > > > > - return; > > > > > + ret = readx_poll_timeout_atomic(rockchip_pmu_domain_is_on, pd, is_on, > > > > > + is_on == on, 0, 10000); > > > > > + if (ret) { > > > > > + dev_err(pmu->dev, "failed to set domain '%s' %s, val=%d\n", > > > > > + genpd->name, on ? "on" : "off", is_on); > > > > > + return ret; > > > > > } > > > > > + > > > > > + return 0; > > > > > } > > > > > > > > > > static int rockchip_pd_power(struct rockchip_pm_domain *pd, bool power_on) > > > > > @@ -592,7 +598,11 @@ static int rockchip_pd_power(struct rockchip_pm_domain *pd, bool power_on) > > > > > rockchip_pmu_set_idle_request(pd, true); > > > > > } > > > > > > > > > > - rockchip_do_pmu_set_power_domain(pd, power_on); > > > > > + ret = rockchip_do_pmu_set_power_domain(pd, power_on); > > > > > + if (ret < 0) { > > > > > + clk_bulk_disable(pd->num_clks, pd->clks); > > > > > + return ret; > > > > > > > > Looking at it, we shouldn't return directly from here because the > > > > mutex never gets unlocked. > > > > > > Yes, we should do that after patch 2/7 from this series :) > > > > That's excellent! > > > > > > > > > Instead of repeating clk_bulk_disable and return ret for each failure, > > > > we can initialize ret = 0, have a goto: out pointing to > > > > clk_bulk_disable, and change return 0 to return ret at the end. > > > > > > Right now there is only a single clk_bulk_disable() in an error > > > case, so I did not use the typical error goto chain. I suppose > > > it makes a lot more sense with proper error handling for the calls > > > to rockchip_pmu_set_idle_request(). > > > > If you'd like, I can base my v2 on this patch series with the changes > > I'm suggesting? > > I leave you guys to decide the best way forward, but please keep in > mind that fixes/stable patches are easier managed if they are as > simple as possible and without relying on cleanup patches. Better fix > the problem first, then clean up the code. I had this ordered the other way around initially and as Heiko pointed out that makes things more complicated overall: https://lore.kernel.org/linux-rockchip/4864529.A9s0UXYOmP@diego/ Greetings, -- Sebastian
Attachment:
signature.asc
Description: PGP signature