On Wed, Nov 18, 2020 at 5:42 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>
> On 17-11-20, 14:06, Rafael J. Wysocki wrote:
> > Is this really a cpufreq thing, though, or is it arch stuff? I think
> > the latter, because it is not necessary for anything in cpufreq.
> >
> > Yes, acpi-cpufreq happens to know this information, because it uses
> > processor_perflib, but the latter may as well be used by the arch
> > enumeration of CPUs and the freqdomain_cpus mask may be populated from
> > there.
> >
> > As far as cpufreq is concerned, if the interface to the hardware is
> > per-CPU, there is one CPU per policy and cpufreq has no business
> > knowing anything about the underlying hardware coordination.
>
> It won't be used by cpufreq for now at least and yes I understand your
> concern. I opted for this because we already have a cpufreq
> implementation for the same thing and it is usually better to reuse
> this kind of stuff instead of inventing it over.

Do you mean related_cpus and real_cpus?

That's the granularity of the interface to the hardware I'm talking about.

Strictly speaking, it means "these CPUs share a HW interface for perf
control" and it need not mean "these CPUs are in the same
clock/voltage domain". Specifically, it need not mean "these CPUs are
the only CPUs in the given clock/voltage domain". That's what it
means when the control is exercised by manipulating OPPs directly, but
not in general.

In the ACPI case, for example, what the firmware tells you need not
reflect the HW topology in principle. It only tells you whether or
not it wants you to coordinate a given group of CPUs and in what way,
but this may not be the whole picture from the HW perspective. If you
need the latter, you need more information in general (at least you
need to assume that what the firmware tells you actually does reflect
the HW topology on the given SoC).

So yes, in the particular case of OPP-based perf control, cpufreq
happens to have the same information that is needed by the other
subsystems, but otherwise it may not and what I'm saying is that it
generally is a mistake to expect cpufreq to have that information or
to be able to obtain it without the help of the arch/platform code.
Hence, it would be a mistake to design an interface based on that
expectation.

Or looking at it from a different angle, today a cpufreq driver is
only required to specify the granularity of the HW interface for perf
control via related_cpus. It is not required to obtain extra
information beyond that. If a new mask to be populated by it is
added, the driver may need to do more work which is not necessary from
the perf control perspective. That doesn't look particularly clean to
me.

Moreover, adding such a mask to cpufreq_policy would make the users of
it depend on cpufreq sort of artificially, which need not be useful
even.

IMO, the information needed by all of the subsystems in question
should be obtained and made available at the arch/platform level and
everyone who needs it should be able to access it from there,
including the cpufreq driver for the given platform if that's what it
needs to do.

BTW, cpuidle may need the information in question too, so why should
it be provided via cpufreq rather than via cpuidle?
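
To make the distinction above concrete, here is a rough userspace sketch, not kernel code. Every name in it (cpu_mask_t, arch_set_freqdomain(), arch_get_freqdomain(), arch_cpus_share_freqdomain(), cpufreq_set_related(), cpufreq_get_related(), example_setup()) is hypothetical and made up for illustration. The 8-CPU, two-domain topology is an assumption as well. It models a platform where the perf control interface is per-CPU, so each policy has a single CPU in related_cpus, while the arch/platform layer separately records which CPUs actually share a clock/voltage domain.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8

/* One bit per CPU: a userspace stand-in for the kernel's struct cpumask. */
typedef uint32_t cpu_mask_t;

/*
 * Hypothetical arch/platform-level data: for each CPU, the set of CPUs
 * in the same clock/voltage domain.
 */
static cpu_mask_t arch_freqdomain[NR_CPUS];

/*
 * What a cpufreq driver has to describe today: for each CPU, the set of
 * CPUs sharing its HW interface for perf control (related_cpus).
 */
static cpu_mask_t cpufreq_related[NR_CPUS];

/* Record 'group' as the group of every CPU that is a member of it. */
static void set_group(cpu_mask_t *table, cpu_mask_t group)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (group & (1u << cpu))
			table[cpu] = group;
}

void arch_set_freqdomain(cpu_mask_t domain)
{
	set_group(arch_freqdomain, domain);
}

void cpufreq_set_related(cpu_mask_t related)
{
	set_group(cpufreq_related, related);
}

cpu_mask_t arch_get_freqdomain(int cpu)
{
	return arch_freqdomain[cpu];
}

cpu_mask_t cpufreq_get_related(int cpu)
{
	return cpufreq_related[cpu];
}

/* Any subsystem can ask the arch layer, without going through cpufreq. */
bool arch_cpus_share_freqdomain(int a, int b)
{
	return (arch_freqdomain[a] & (1u << b)) != 0;
}

/* Assumed topology: domains {0-3} and {4-7}, per-CPU perf control. */
void example_setup(void)
{
	arch_set_freqdomain(0x0f);
	arch_set_freqdomain(0xf0);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpufreq_set_related(1u << cpu);
}
```

In this sketch the two tables answer different questions: cpufreq_get_related() only describes the interface granularity, whereas a scheduler, thermal or cpuidle user would call arch_cpus_share_freqdomain() and never needs a cpufreq policy to exist.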