Hi Viresh, On 2019-02-08 10:23, Viresh Kumar wrote: > On 08-02-19, 10:15, Marek Szyprowski wrote: >> On 2019-02-08 09:55, Viresh Kumar wrote: >>> On 08-02-19, 09:12, Marek Szyprowski wrote: >>>> On 2019-02-08 07:49, Viresh Kumar wrote: >>>>> Why don't you get similar problem during suspend? I think you can get >>>>> it when the CPUs are offlined as I2C would have gone by then. The >>>>> cpufreq or OPP core can try and run some regulator or genpd or clk >>>>> calls to disable resources, etc. Even if doesn't happen, it certainly >>>>> can. >>>> CPUfreq is suspended very early during system suspend and thus it does >>>> nothing when CPUs are being offlined. >>>>> Also at resume the cpufreq core may try to change the frequency right >>>>> from ->init() on certain cases, though not everytime and so the >>>>> problem can come despite of this series. >>>> cpufreq is still in suspended state (it is being 'resume' very late in >>>> the system resume procedure), so if driver doesn't explicitly change any >>>> opp in ->init(), then cpufreq core waits until everything is resumed. To >>>> sum up, this seems to be fine, beside the issue with regulator >>>> initialization I've addressed in this patchset. >>> Yeah, the governors are suspended very soon, but any frequency change >>> starting from cpufreq core can still happen. There are at least two >>> points in cpufreq_online() where we may end up changing the frequency, >>> but that is conditional and may not be getting hit. >> Then probably cpufreq core suspend should handle this. > Handle what ? That code is part of cpufreq_online() and needs to be > there only. If got it right, it is a matter of handling CPUFREQ_NEED_INITIAL_FREQ_CHECK flag. Maybe there should be additional check if CPUfreq is not suspended? >>>>> We guarantee that the resources are available during probe but not >>>>> during resume, that's where the problem is. >>>> Yes, so I've changed cpufreq-dt to the common approach, in which the >>>> driver keeps all needed resources for the whole lifetime of the device. >>> That's not what I was saying actually. I was saying that it should be >>> fine to do a I2C transfer during resume, else we will always have >>> problems and have to fix them with hacks like the one you proposed >>> where you acquire resources for all the possible CPUs. Maybe we can >>> fix it once and for all. >> It is fine to do i2c transfers during cpufreq resume (see > By resume I meant system resume and the whole onlining process of > non-boot CPUs. Right now those are 2 separate things in cpufreq core. >> drivers/base/power/main.c dpm_resume() function for exact call place). >> The problem is that such calls are not allowed in ->init(), which might >> be called very early from CPU hotplug path (CPUs are resumed in the >> first step of system resume procedure). > Right and that's where I think we can do something to fix it in a > proper way. > >> What's wrong with my proposed fix? It is not that uncommon to gather all >> resources in probe() and keep them until remove() happens. > For cpufreq drivers, we must be doing most of the stuff in init/exit > only as far as possible. I am not saying your patch is bad, that is > the best we can do in such situations. But I don't like that we have > to get the resources for all CPUs, despite the fact that many of them > would be part of the same policy and hence share resources. The > problem was that we need to get sharing-cpus detail as well in probe > then, etc. Both resources in this case: clocks and regulators are refcounted by their frameworks, so the cost of getting a few more references for them is imho negligible. > I was thinking about doing disable_nonboot_cpus() much earlier as the > userspace is already frozen by then. > > @Rafael: Will that slowdown the suspend/resume process? Maybe not as > we are doing everything from a single CPU/thread anyways ? For some reasons drivers are handled partially asynchronously in suspend/resume procedure and can do suspend and resume in parallel. I don't think that limiting the whole suspend/resume process to a single cpu is the best we can do. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland