Re: [External] Re: RFC: offering a standardized (/sys/class) userspace API for selecting system/laptop performance-profiles

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Mark,

Thank you very much for your feedback, much appreciated.

On 9/24/20 4:10 AM, Mark Pearson wrote:


On 2020-09-22 6:43 a.m., Hans de Goede wrote:
Hi,

On 9/21/20 11:03 AM, Elia Devito wrote:
Hi all, sorry for response delay I'm very busy at work this period

No problem.

A common interface is surely the best solution, especially because it allows
to standardize the user-space tools and maybe to integrate its with desktop,
like Bastien is doing with gnome-power-profiles-daemon or like the similar tool
plasma-pstate.

I think we should keep separate performance and thermal profiles thus leaving
the possibility of setting a thermal profile independently of the performance
profile and vice versa.

Hp implements up to 4 thermal profiles (apparently the same ones that implement
dell), my patch implements the first 3 profiles which are the ones supported by
my hardware.

1. HP Recommended -> fan stay off and start at low~medium speed when necessary
2. Performance    -> fan stay off and start at medium~hight speed when
necessary
3. Cool           -> fan stay off and start at medium~hight speed when
necessary
4. Quiet          -> fan should stay off and start at very low speed if
necessary

for each profile the firmware set also a OEM variable to select DPTF profile
with the adeguate power limit.

combining these profiles with the performance profiles it is possible to obtain
the desired performance according to the needs of the moment

e.g.

For gaming purpose when the CPU and GPU share the thermal budget, in this case
the best solution is to set thermal profile to performance to maximize the heat
dissipation and the p-state profile to powersave, in this way during loadings
the cpu gain a performance boost that allow to reduce loading time, instead,
during gameplay the cpu performance will be limited in favor of the GPU
allowing the maximum framerate to be reached.
(feral had to handle it for its gamemode tool:
https://github.com/FeralInteractive/gamemode/pull/179)

Another opposed particular case could be thermal profile set to quiet and
p-state set to performance, usefull for example to maximizze cpu performance
in silent ambient room like a library, obviously for CPU-only intesive tasks
the best solution is to set either thermal and performance profile to
performance.

Basically there are infinite combinations that can be made to obtain the best
configuration for each situation, to allow this a common interface should offer
a possibility to:

- Define the list of thermal profiles separately from the performance ones
- Eventually define a list of on/off attributes (useful for lenovo lap_mode?)
- Provide a description of them
- Switching between thermal profiles regardless of the performance profile

A possible solution could be a "slider like" interface for performance level
and a list of thermal profile.

So I have been thinking about this and performance level and thermal profile
are really inherently couple to each other. Telling the CPU it can use
25W TPD instead of the default 15W, without also ramping up the cooling is
just going to lead to a whole bunch of thermal throttling.

In a desktop machine with a discrete GPU it is sorta easy, in essence you have
a GPU performance profile, controlling GPU TPD/turbo behavior and the GPU
fans too match, and a CPU performance profile which likewise controls
the CPU fan profile too match the CPU performance profile.

With laptops with a discrete GPU things become harder because there is a
single shared cooling mechanism. But there you could simply say that
performance-profile = max(gpu-profile, cpu-profile).

I mean telling the GPU and CPU that they can burn a gazillion watts and
then telling the cooling setup to be as quiet as possible, is clearly
not going to end well.

This all assumes that we have some nice way to tell the hardware about
the 3 separate (gpu / cpu / cooling) profiles we want.

But that is not always the case; and often when using a CPU with
integrated GPU they are all tied together.

So my proposal is to have a :

/sys/class/performance-profile

Underneath we can have one or more entries (performance-profile providers)
each one with a performance_level file on the previously suggested 0-100 scale
and a performance_mappings file listing the supported discrete values on that
scale and some descriptions of those discrete values purely for informational
purposes.

Besides the performance_level and performance_mappings files I would also like
to add a "type" sysfs attribute, which can have 1 of 3 values:
"system", "cpu" and "gpu".

So something like the thinkpad_acpi performance levels will be "system", and
the intel_pstate driver could maybe also register itself here as a "cpu"
type performance-profile provider.

This will allow userspace (if / when it wants to) to do things like put the
CPU in medium/balanced mode while telling the GPU to go full-throttle for when
playing a game which is clearly GPU limited.

This game scenario of course assumes that we then actually will have a
performance-profile driver for both the CPU and the GPU.

Note I'm still just brainstorming here, but I think that having the
0-100 scale + the type thing should cover all the use-cases we want
to cover.

As always feedback or alternative API proposals are very much welcome.

I was mulling this over and I think the 1 to 100 slider seems awkward - too many levels and the fine graduation mapping onto somewhat coarse (3, 4 or 5) options doesn't seem to fit well for me. I'm struggling to see how it fits the tweaking CPU vs GPU vs system - do we end up introducing multiple sliders - which gets complicated with all the choices? I'm just not convinced yet.

So one thing to keep in mind here is that this is the kernel <-> userspace
API. It certainly is not the intention for a UI to show this 0-100 scale
to end-users. The most raw version of a UI should display the number
of actually supported levels given by the mappings file, preferably with
the labels from the mappings file visible.

So the idea behind the 0-100 scale, is that lets sat that
we have 2 laptops, which each 4 profiles:

1. Game laptop:    Low power, Balanced, Performance, Ultra Performance
2. Ultra-portable: Ultra Low Power, Low Power, Balanced, Performance

And we have a UI which offers a balanced option, then to configure
the balanced option on laptop 1. we would need to select profile 1 (of 0-3)
and on laptop 2. we would need to select profile 2 (of 0-3)

The idea behind the 0-100 scale is that in that case both would map
there balanced profile to 50 and userspace can just write 50.

Note I'm not 100% sold on this idea myself yet.

I'm also not sure about being able to tweak everything too - Linux users are smart but is it just getting irritating at that point? Power (hah!) users have plenty of tweaking tools at their disposal if they want to get into change p-states and tweak every little power option; but this exercise is aimed at those who want to make simple general changes - increase the power because they have a build to complete, drop into quiet mode when watching a movie, or a low power setting when they know they're not going to be able to charge for a long time. I suspect some smart person will end up automating all this but that's by the by.

Well for power-users to be able to tweak, the kernel API
needs to expose all the support profiles. Before the 0-100
idea, the idea for the user space API was to use strings
and just support "low-power", "balanced" and "performance" +
optional extra profiles where the extra profiles where
sort of a free-for-all wrt names, etc.

So as a counter proposal: Have the slider choose between some key modes that we think cover the use cases that people would *want*.

I'd keep the mode list small to avoid it being confusing - more could be added later if really needed but I'm guessing (and happy to be corrected) that the majority of activities would fit into just a few boxes reasonably well.

For instance (I think I get the following from the above):
  - low power/Cool
  - Quiet
  - balance/default
  - CPU performance
  - GPU performance
>
> The vendors would just choose their optimal firmware setting for that use case depending on what they have available.

From a UI perspective this might very well work, but chances are
that for GPU performance we would need to:

a) Tell the EC to go for maximum cooling / aggressive ramp up the fans
b) Tell the CPU (e.g. pstate, TPD) to go for a balanced profile
c) Tell the GPU, (e.g. the nouveau or radeon drivers) to aggressively
turbo and maybe if it has a configurable TPD give it some more TPD

And we very likely need to talk to multiple drivers for this,
at least some firmware driver and the GPU driver.

Which is where my type attribute idea comes from. I know that for this
discussion started with controlling firmware (DPTF oem vars / EC)
performance-profile settings, but at some point we probably also need
to e.g. bring in controlling GPU turbo performance profiles.

Apart from the GPU performance option that maps pretty well for Lenovo and I think it maps very nicely for HP. I wanted to squish low power ad quiet together but that seemed unfair - they are very similar though :)
I'm struggling to think of other use cases that would really matter...thoughts? Is this *too* simple?

Well this looks a lot like what we started with and I think it is also
closer to what Bastien would prefer as an userspace API for this.

So I guess that would bring us full circle to the initial proposal
where we have a bunch of strings describing profiles, but too avoid
a free-for-all we pre-define a bunch of standard profile names and
we say in the specification that where-ever possible drivers should
map there avaivable profiles to the pre-defined profile names ?

Bastien, have I understood you right that having a list of
pre-defined profile-names and then using strings for the sysfs
API would be your preferred API for this ?

I do still think we should probably add a "type" attribute, which
for thinkpad_acpi and for hp-wmi would probably just be hardcoded
to "system".

I think that having the notion of a separate performance profile for at
least the GPU makes sense, not because of that is the UI which we want
to show to end users, but because the GPU's turbo behavior will likely
be controlled by a completely different driver. AFAIK currently different
GPU drivers have different knobs for this, preparing the
performance-profile API to allow the drivers to offer a unified
(and probably dumbed down a bit) API for this is something which we
should at least keep in mind.

I guess that for GPU drivers we could also have a "custom"
performance-profile, so when users use advanced vendor/driver specific
tools to tweak the GPU performance, then reading back the currently
selected profile would report custom; and writing one of the default
profiles would override the users/tools tweaks and go back to the
settings belonging to that profile.

Anyways adding a "type" sysfs attribute will allow us to also use
the performance-profile API for e.g. GPUs later, we can also even
define only 1 valid value for it for now ("system"). This way we
can document that tools interacting with the API should at least
check the type and skip performance-profiles with a type they do
not know.

Bastien, any comments on the concept of having a type sysfs attribute?

Regards,

Hans




[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux