Re: [Patch 0/5] Support Autonomous Selection mode in cppc_cpufreq

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 






There seems to be some quite fundamental disagreement on how this
should be done, so I'm afraid I cannot do much about it ATM.

Please agree on a common approach and come back to me when you are ready.

Sending two concurrent patchsets under confusingly similar names again
and again isn't particularly helpful.

Thanks!

Hi Rafael,

Thank you for looking into this.

Hi Lifeng,

As per the discussion, we can make the driver future extensible and
also can optimize the register read/write access.

I gave some thought and below is my proposal.

1) Pick 'Patch 1-7' from your patch series [1] which optimize API's
    to read/write a cpc register.

2) Pick my patches in [2]:
    - Patch 1-4: Keep all cpc registers together under acpi_cppc sysfs.
                 Also, update existing API's to read/write regs in batch.
    - Patch 5: Creates 'cppc_cpufreq_epp_driver' instance for booting
      all CPU's in Auto mode and set registers with right values.
      They can be updated after boot from sysfs to change hints to HW.
      I can use the optimized API's from [1] where required in [2].

Let me know if you are okay with this proposal.
I can also send an updated patch series with all the patches combined?

[1] https://lore.kernel.org/all/20250206131428.3261578-1-zhenglifeng1@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20250211103737.447704-1-sumitg@xxxxxxxxxx/

Regards,
Sumit Gupta


Hi Sumit,

Over the past few days, I've been thinking about your proposal and
scenario.

I think we both agree that PATCH 1-7 in [1] doesn't conflicts with [2], so
the rest of the discussion focuses on the differences between [2] and the
PATCH 8 in [1].

We both tried to support autonomous selection mode in cppc_cpufreq but on
different ways. I think the differences between these two approaches can be
summarized into three questions:

1. Which sysfs files to expose? I think this is not a problem, we can keep
all of them.

2. Where to expose these sysfs files? I understand your willing to keep all
cpc registers together under acpi_cppc sysfs. But in my opinion, it is more
suitable to expose them under cppc_cpufreq_attr, for these reasons:

   1) It may probably introduce concurrency and data consistency issues, as
I mentioned before.


As explained in previous reply, this will be solved with the ifdef
check to enable the attributes for only those CPUFREQ drivers which want
to use the generic nodes.
 e.g: '#ifdef CONFIG_ACPI_CPPC_CPUFREQ' for 'cppc_cpufreq'.
			
These CPC register read/write sysfs nodes are generic as per the ACPI
specification and without any vendor specific logic.
So, adding them in the lib file 'cppc_acpi.c'(CONFIG_ACPI_CPPC_LIB) will
avoid code duplication if a different or new ACPI based CPUFREQ driver
also wants to use them just by adding their macro check. Such ifdef check is also used in other places for attributes creation like below.
So, don't look like a problem.
 $ grep -A4 "acpi_cpufreq_attr\[" drivers/cpufreq/acpi-cpufreq.c
 static struct freq_attr *acpi_cpufreq_attr[] = {
	&freqdomain_cpus,
 #ifdef CONFIG_X86_ACPI_CPUFREQ_CPB
	&cpb,
 #endif

   2) The store functions call cpufreq_cpu_get() to get policy and update
the driver_data which is a cppc_cpudata. Only the driver_data in
cppc_cpufreq's policy is a cppc_cpudata! These operations are inappropriate
in cppc_acpi. This file currently provides interfaces for cpufreq drivers
to use. Reverse calls might mess up call relationships, break code
structures, and cause problems that are hard to pinpoint the root cause!


If we don't want to update the cpufreq policy from 'cppc_acpi.c' and only update it from within the cpufreq, then this could be one valid
point to not add the write syfs nodes in 'cppc_acpi.c' lib file.

@Rafael, @Viresh : Do you have any comments on this?

   3) Difficult to extend. Different cpufreq drivers may have different
processing logic when reading from and writing to these CPC registers.
Limiting all sysfs here makes it difficult for each cpufreq driver to
extend. I think this is why there are only read-only interfaces under
cppc_attrs before.


We are updating the CPC registers as per the generic ACPI specification.
So, any ACPI based CPUFREQ driver can use these generic nodes to
read/write reg's until they have a vendor specific requirement or
implementation.
As explained above, If someone wants to update in different way and use
their own CPUFREQ driver then these generic attributes won't be created
due to the CPUFREQ driver macro check.
I think AMD and Intel are doing more than just reading/updating the registers. That's why they needed their driver specific implementations.

Adding a 'ifdef' is not a good way to solve these problems. Defining this
config does not necessarily mean that the cpufreq driver is cppc_cpufreq.


It means that only.
./drivers/cpufreq/Makefile:obj-$(CONFIG_ACPI_CPPC_CPUFREQ) += cppc_cpufreq.o

3. Is it necessary to add a new driver instance? [1] exposed the sysfs
files to support users dynamically change the auto selection mode of each
policy. Each policy can be operated seperately. It seems to me that if you
want to boot all CPUs in auto mode, it should be sufficient to set all
relevant registers to the correct values at boot time. I can't see why the
new instance is necessary unless you explain it further. Could you explain
more about why you add a new instance starting from answer these questions:

For a specific CPU, what is the difference between using the two instances
when auto_sel is 1? And what is the difference when auto_sel is 0?


Explained this in previous reply. Let me elaborate more.

For hundred's of CPU's, we don't need to explicitly set multiple sysfs
after boot to enable and configure Auto mode with right params. That's why an easy option is to pass boot argument or module param for enabling
and configuration.
A separate instance 'cppc_cpufreq_epp' of the 'cppc_cpufreq' driver is
added because policy min/max need to be updated to the min/max_perf
and not nominal/lowest nonlinear perf which is done by the default
init hook. Min_perf value can be lower than lowest nonlinear perf and Max_perf can be higher than nominal perf. If some CPU is booted with epp instance and later the auto mode is disabled or min/max_perf is changed from sysfs then also the policy
min/max need to be updated accordingly.

Another is that in Autonomous mode the freq selection and setting is
done by HW. So, cpufreq_driver->target() hook is not needed.
These are few reasons which I am aware of as of now.
I think in future there can be more. Having a separate instance
reflecting a HW based Autonomous frequency selection will make it easy
for any future changes.

If it turns out that the new instance is necessary, I think we can reach a
common approach by adding this new cpufreq driver instance and place the
attributes in 'cppc_cpufreq_epp_attr', like amd-pstate did.

What do you think?

I initially thought about this but there was a problem.
What if we boot with non-epp instance which doesn't have these attributes and later want to enable Auto mode for few CPU's from sysfs.


Best Regards,
Sumit Gupta




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux