10.08.2021 13:51, Ulf Hansson пишет: ... >> + if (power_on) { >> + ret = dev_pm_genpd_set_performance_state(dev, default_pstate); > > However, this is more questionable to me. > > First, I don't think we should care about whether this is "power_on" > or not. At this point, performance states are treated orthogonal to > idle states in genpd. We may decide to change that in some way, but > that deserves a different change. Okay > Second, I don't think we should call > dev_pm_genpd_set_performance_state() from here. It's probably better > handled from the genpd callback itself, if/when needed. Indeed > That said, perhaps the new callback should just return a regular error > code and zero on success, rather than the current performance state. > See more below. > >> + if (ret) { >> + dev_err(dev, "failed to set default performance state %u for PM domain %s: %d\n", >> + default_pstate, pd->name, ret); >> + goto out; >> + } >> + } else { >> + dev_gpd_data(dev)->rpm_pstate = default_pstate; > > No, this isn't the right thing to do. > > It looks like you are trying to use the ->rpm_pstate for > synchronization with runtime PM for consumer drivers. This is fragile > as it depends on the runtime PM deployment in the consumer driver. I > think you should look at ->rpm_pstate as a variable solely for > managing save/restore of the performance state for the device, during > runtime suspend/resume in genpd. > > Synchronization of a vote for a performance state for a device, needs > to be managed by calling dev_pm_genpd_set_performance_state() - or by > calling an OPP function that calls it, like dev_pm_opp_set_rate(), for > example. The !power_on case should be skipped at all because common core code can't handle it properly, hence rpm_pstate shouldn't be touched. I realized it only after sending out this email. ... >>> Rather than doing the OF parsing above to find out the current state >>> for the device, why can't you just call dev_pm_opp_set_rate() to >>> initialize a proper vote instead? >>> >> >> For some devices clock rate is either preset by bootloader, or by clk driver, or by assigned-clocks in a device-tree. And then I don't see what's the difference in comparison to initialization for the current rate. >> >> For some devices, like memory controller, we can't just change the clock rate because it's a complex procedure and some boards will use fixed rate, but the power vote still must be initialized. > > I am not saying you should change the clock rate. The current code > path that runs via devm_tegra_core_dev_init_opp_table() just calls > clk_get_rate and then dev_pm_opp_set_rate() with the current rate to > vote for the corresponding OPP level. Right? > > Isn't this exactly what you want? No? I see now what you meant, it's actually a simpler variant and it works too. Thank you for the suggestion, I'll prepare v8.