[RFC PATCH 0/2] cpufreq_ext: Introduce cpufreq ext governor

Hi everyone,

I am currently working on a patch set for a BPF-based CPU frequency
governor, which lets BPF programs implement custom frequency scaling
strategies.

If you have any feedback or suggestions, please do let me know.

Motivation
----------

1. Customization

Existing cpufreq governors in the kernel are designed for general
scenarios, which may not always be optimal for specific or specialized
workloads.

The userspace governor allows direct control over cpufreq, but users
often require guidance from the kernel to achieve the desired frequency.

Cpufreq_ext aims to address this by providing a customizable framework that
can be tailored to the unique needs of different systems and applications.

While cpufreq governors can be implemented within a kernel module,
maintaining a module (.ko) tailored for specific scenarios can be
challenging.
The complexity and overhead associated with kernel modules make it
difficult to quickly adapt and deploy custom frequency scaling strategies.

Cpufreq_ext leverages BPF to offer a more lightweight and flexible approach
to implementing customized strategies, allowing for easier maintenance and
deployment.

2. Integration with sched_ext

sched_ext is a scheduler class whose behavior can be defined by a set of
BPF programs - the BPF scheduler.

See [1] for more about sched_ext:

	[1] https://www.kernel.org/doc/html/next/scheduler/sched-ext.html

The interaction between CPU frequency scaling and task scheduling is
critical for performance.

cpufreq_ext can work with sched_ext to ensure that both scheduling
decisions and frequency adjustments are made in a coordinated manner,
optimizing system responsiveness and power consumption.

Overview
--------

cpufreq_ext is a BPF-based cpufreq governor: its frequency scaling
behavior can be customized in a BPF program.

cpufreq_ext attaches to cpufreq policies like any other cpufreq governor.

		   --------------------------
		  |        BPF governor      |
		   --------------------------
			       |
			       v
			  BPF Register
			       |
			       v
	    --------------------------------------
	   |             CPUFreq ext              |
	    --------------------------------------
	      ^                ^               ^
	      |                |               |
	   ---------       ---------       ---------
	  | policy0 | ... | policy1 | ... | policyn |
	   ---------       ---------       ---------

Several function hooks can be registered with cpufreq_ext through BPF
struct_ops.

The first patch defines a dbs_governor that works like the other
governors.

The second patch provides a sample showing how to use it. The sample
implements one typical cpufreq policy: switch to the maximum frequency
when a VIP task is running on the target CPU.

Detail
------

cpufreq_ext uses bpf_struct_ops to register several function hooks.

	struct cpufreq_governor_ext_ops {
		...
	}

cpufreq_governor_ext_ops defines all the functions that a BPF program
can implement.

To add a new custom hook, it only needs to be defined in this struct.

At the moment, the following basic members are defined.

1. unsigned long (*get_next_freq)(struct cpufreq_policy *policy)

	Decide how to adjust the CPU frequency here.
	The return value is the CPU frequency that will be applied.

2. unsigned int (*get_sampling_rate)(struct cpufreq_policy *policy)

	Decide how to adjust sampling_rate here.
	The return value is the governor sampling rate that will be
	applied.

3. unsigned int (*init)(void)

	BPF governor init callback. A return value of 0 means success.

4. void (*exit)(void)

	BPF governor exit callback.

5. char name[CPUFREQ_EXT_NAME_LEN]

	BPF governor name.
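
As an illustration of how these five members fit together, here is a
minimal userspace sketch in plain C. It is not the kernel or BPF code
from the patches. Only the member names and signatures come from the
list above. The cut-down struct cpufreq_policy, the value of
CPUFREQ_EXT_NAME_LEN, the vip_running flag, the sampling rate value and
the fallback to the minimum frequency are all assumptions made for this
example. It models the rule of the sample governor: pick the maximum
frequency while a VIP task is running.

```c
#include <stdbool.h>

/* Assumed value for illustration; the real constant comes from the patch. */
#define CPUFREQ_EXT_NAME_LEN 16

/* Cut-down userspace stand-in for the kernel's struct cpufreq_policy. */
struct cpufreq_policy {
	unsigned int cpu;
	unsigned int min;	/* kHz */
	unsigned int max;	/* kHz */
	unsigned int cur;	/* kHz */
};

/* Members as listed in the cover letter. */
struct cpufreq_governor_ext_ops {
	unsigned long (*get_next_freq)(struct cpufreq_policy *policy);
	unsigned int (*get_sampling_rate)(struct cpufreq_policy *policy);
	unsigned int (*init)(void);
	void (*exit)(void);
	char name[CPUFREQ_EXT_NAME_LEN];
};

/* Hypothetical flag: true while a VIP task runs on the policy's CPU. */
static bool vip_running;

static unsigned long vip_get_next_freq(struct cpufreq_policy *policy)
{
	/* Falling back to policy->min is an assumption of this sketch. */
	return vip_running ? policy->max : policy->min;
}

static unsigned int vip_get_sampling_rate(struct cpufreq_policy *policy)
{
	(void)policy;
	return 10000;	/* assumed value and unit (microseconds) */
}

static unsigned int vip_init(void)
{
	return 0;	/* 0 means success */
}

static void vip_exit(void)
{
}

struct cpufreq_governor_ext_ops vip_ops = {
	.get_next_freq = vip_get_next_freq,
	.get_sampling_rate = vip_get_sampling_rate,
	.init = vip_init,
	.exit = vip_exit,
	.name = "vip_sample",
};
```

In the real governor these callbacks would live in a BPF program and be
attached through struct_ops; the table of function pointers above only
mirrors that shape.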

cpufreq_ext also adds a sysfs interface that reports the governor status.

1. ext/stat attribute:

	Access to current BPF governor status.

	# cat /sys/devices/system/cpu/cpufreq/ext/stat
	Stat: CPUFREQ_EXT_INIT
	BPF governor: performance
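
A tool that wants to check the governor state could read this attribute
and split it into its two fields. The following is a small userspace
sketch that assumes the two-line "Stat: ..." / "BPF governor: ..." layout
shown above stays as is. The struct ext_stat type and both helper names
are hypothetical and not part of the patches.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the two fields of ext/stat. */
struct ext_stat {
	char stat[64];
	char governor[64];
};

/*
 * Parse text of the form:
 *   Stat: CPUFREQ_EXT_INIT
 *   BPF governor: performance
 * Returns 0 on success, -1 if the text does not match.
 */
int parse_ext_stat(const char *text, struct ext_stat *out)
{
	memset(out, 0, sizeof(*out));
	if (sscanf(text, "Stat: %63s BPF governor: %63s",
		   out->stat, out->governor) != 2)
		return -1;
	return 0;
}

/* Read the attribute at path and parse it. Returns 0 or -1. */
int read_ext_stat(const char *path, struct ext_stat *out)
{
	char buf[256];
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_ext_stat(buf, out);
}
```

On a kernel with the patches applied, read_ext_stat() would be called
with "/sys/devices/system/cpu/cpufreq/ext/stat".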

There are a number of constraints on cpufreq_ext:

1. Only one ext governor can be registered at a time.

2. By default, it operates as a performance governor when no BPF
   governor is registered.

3. The cpufreq_ext governor must be selected before loading a BPF
   governor; otherwise, the installation of the BPF governor will fail.
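
The three constraints can be read as a small state machine. The sketch
below models them in plain userspace C to make the ordering explicit. It
is not the registration code from the patch: struct ext_core, the
function names, the name length and the -1 error value are all
assumptions made for this example.

```c
#include <stdbool.h>
#include <stdio.h>

#define EXT_NAME_LEN 16	/* assumed length for this sketch */

/* Hypothetical model of the cpufreq_ext registration state. */
struct ext_core {
	bool ext_selected;	/* cpufreq_ext selected as the governor */
	bool registered;	/* a BPF governor is currently registered */
	char name[EXT_NAME_LEN];
};

/* Returns 0 on success, -1 if a constraint is violated. */
int ext_register(struct ext_core *c, const char *name)
{
	if (!c->ext_selected)
		return -1;	/* constraint 3: select cpufreq_ext first */
	if (c->registered)
		return -1;	/* constraint 1: only one at a time */
	snprintf(c->name, sizeof(c->name), "%s", name);
	c->registered = true;
	return 0;
}

void ext_unregister(struct ext_core *c)
{
	c->registered = false;
	c->name[0] = '\0';
}

/* Constraint 2: behave as "performance" when nothing is registered. */
const char *ext_active_name(const struct ext_core *c)
{
	return c->registered ? c->name : "performance";
}
```

With this model, registering before cpufreq_ext is selected fails, a
second registration fails while the first is active, and unregistering
falls back to the performance behavior.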

TODO
----

The current patch is a starting point, and future work will focus on
expanding its capabilities.

I plan to leverage the BPF ecosystem to introduce innovative features,
such as real-time adjustments and optimizations based on system-wide
observations and analytics.

I look forward to any insights, critiques, or suggestions you may have.

Yipeng Zou (2):
  cpufreq_ext: Introduce cpufreq ext governor
  cpufreq_ext: Add bpf sample

 drivers/cpufreq/Kconfig        |  23 ++
 drivers/cpufreq/Makefile       |   1 +
 drivers/cpufreq/cpufreq_ext.c  | 525 +++++++++++++++++++++++++++++++++
 samples/bpf/.gitignore         |   1 +
 samples/bpf/Makefile           |   8 +-
 samples/bpf/cpufreq_ext.bpf.c  | 113 +++++++
 samples/bpf/cpufreq_ext_user.c |  48 +++
 7 files changed, 718 insertions(+), 1 deletion(-)
 create mode 100644 drivers/cpufreq/cpufreq_ext.c
 create mode 100644 samples/bpf/cpufreq_ext.bpf.c
 create mode 100644 samples/bpf/cpufreq_ext_user.c

-- 
2.34.1




