Re: [PATCH 2/2] platform/x86: hp-wmi: Add thermal profile support for 8BAD boards

On Wed, Jan 03, 2024 at 11:21:24AM +1100, Prajna Sariputra wrote:
> On Sunday, 31 December 2023 9:46:25 PM AEDT Alexis Belmonte wrote:
> > Add 8BAD to the list of boards which have thermal profile selection
> > available. This allows the CPU to draw more power than the default TDP
> > barrier defined by the 'balanced' thermal profile (around 50W), hence
> > allowing it to perform better without being throttled by the embedded
> > controller (around 130W).
> > 
> > We first need to set the HP_OMEN_EC_THERMAL_PROFILE_TIMER_OFFSET to zero.
> > This prevents the timer countdown from reaching zero, making the embedded
> > controller "force-switch" the system's thermal profile back to 'balanced'
> > automatically.
> > 
> > We also need to put a number of specific flags in
> > HP_OMEN_EC_THERMAL_PROFILE_FLAGS_OFFSET when we're switching to another
> > thermal profile:
> > 
> >    - for 'performance', we need to set both HP_OMEN_EC_FLAGS_TURBO and
> >      HP_OMEN_EC_FLAGS_NOTIMER;
> > 
> >    - for 'balanced' and 'powersave', we clear out the register to notify
> >      the system that we want to lower the TDP barrier as soon as possible.
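
To make the sequence above concrete, here is a minimal userspace sketch of
the two writes. It is not the code from the patch: the EC is faked with a
plain array, the offsets and bit values are placeholders chosen for
illustration, and every `sketch_`/`fake_` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only; check the patch for the real ones. */
#define HP_OMEN_EC_THERMAL_PROFILE_FLAGS_OFFSET 0x62
#define HP_OMEN_EC_THERMAL_PROFILE_TIMER_OFFSET 0x63

enum hp_thermal_profile_omen_flags {
	HP_OMEN_EC_FLAGS_JUSTSET = 0x01,	/* listed for completeness, unused here */
	HP_OMEN_EC_FLAGS_NOTIMER = 0x02,
	HP_OMEN_EC_FLAGS_TURBO   = 0x04,
};

/* Hypothetical profile type for this sketch. */
enum sketch_profile { SKETCH_POWERSAVE, SKETCH_BALANCED, SKETCH_PERFORMANCE };

/* Fake EC register space standing in for real EC access. */
static uint8_t fake_ec[256];

static void fake_ec_write(uint8_t offset, uint8_t value)
{
	fake_ec[offset] = value;
}

/* 'performance' gets TURBO | NOTIMER; the other profiles clear the register. */
static uint8_t sketch_profile_flags(enum sketch_profile profile)
{
	if (profile == SKETCH_PERFORMANCE)
		return HP_OMEN_EC_FLAGS_TURBO | HP_OMEN_EC_FLAGS_NOTIMER;
	return 0;
}

static void sketch_set_timed_profile(enum sketch_profile profile)
{
	assert(profile <= SKETCH_PERFORMANCE);

	/* Zero the timer first so the EC does not force-switch to 'balanced'. */
	fake_ec_write(HP_OMEN_EC_THERMAL_PROFILE_TIMER_OFFSET, 0);

	/* Then write the flags that match the requested profile. */
	fake_ec_write(HP_OMEN_EC_THERMAL_PROFILE_FLAGS_OFFSET,
		      sketch_profile_flags(profile));
}
```

Calling sketch_set_timed_profile(SKETCH_PERFORMANCE) leaves the fake flags
register holding both bits and the fake timer register at zero; any other
profile leaves both registers at zero.
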
> 
> Do you know if there's a way to check that a given model has this specific timer,
> other than just testing the patch?

I haven't been able to figure that out yet -- I came across a
'device_list.json' file (IIRC) somewhere in the Omen Control Center app,
but I've found no simple, universal way of checking whether this behavior
is active :[

I think I remember seeing another model ID defined near mine, so I *could*
add it directly to both lists.

> I have an Omen 16-n0000 (8A42), which has a Ryzen 7 6800H and a Radeon
> RX 6650M, and I've been patching the kernel to add it to the omen_thermal_profile_boards
> array for a while now. Just doing that prevents the worst of the throttling from
> happening (GPU dropping from 105W to 35W and the CPU being stuck at like 2GHz
> or less), but currently the GPU still drops to 75W eventually. Switching to
> performance does make it go back to 105W (and even 120W for a bit) before it
> goes back down to 75W, so it makes me wonder if there is actually a timer on my
> model that's doing it rather than just thermal throttling.

I think you've answered it yourself with your other mail ;]

> 
> > 
> > The third flag defined in the hp_thermal_profile_omen_flags enum,
> > HP_OMEN_EC_FLAGS_JUSTSET, is present for completeness.
> > 
> > To prevent potential behaviour breakage with other Omen models, a
> > separate omen_timed_thermal_profile_boards array has been added to list
> > which boards expose this behaviour.
> > 
> > Performance benchmarking was done with the help of silver.urih.com and
> > Google Chrome 120.0.6099.129, on Gnome 45.2, with the 'performance'
> > thermal profile set:
> > 
> > |                  | Performance |     Stress |   TDP |
> > |------------------|-------------|------------|-------|
> > |    with my patch |      P84549 |    S0.1891 |  131W | 
> > | without my patch |      P44084 |    S0.1359 |   47W |
> > 
> > The TDP measurements were done with the help of the s-tui utility,
> > during the load.
> > 
> > There is still work to be done:
> > 
> >    - tune the CPU and GPU fans to better cool down and enhance
> >      performance at the right time; right now, it seems that the fans are
> >      not properly reacting to thermal/performance events, which in turn
> >      either causes thermal throttling OR makes the fans spin way too long,
> >      even though the temperatures have lowered down
> 
> Yeah, that's also a problem with my model, where with a heavy CPU only workload
> the CPU would boost high and almost immediately run into thermal throttling and
> stays throttled for like a few minutes before the fans ramp up even a little,
> which is why I'm not sure that adding my model to the list upstream would be a
> good idea. My CPU doesn't seem to boost all that high though, I don't remember
> the performance mode making much of a difference the last time I tested it.

I totally agree with you -- I just wanted to make sure that my patch
conformed well enough to the rest of the codebase before making further
progress :]

> Also, for what it's worth I had a conversation with one of the folks who wrote
> the platform profile code (Enver Balalic) a few months ago, and they said the
> profiles are just fan speed control modes on their Omen model.

I've done some reverse engineering on the Omen Control Center app through a
Windows VM, and I came across a few WMI calls in a class (IIRC `HpaClient`, or
something similar to that name) that read this fan curve.

Unfortunately, I haven't yet found the parts that do the writes; that code
still needs to be browsed through. :[

The ACPI table that handles those WMI "methods" is the `SSDT` one --
I've disassembled it with `iasl`, which really helped in figuring out the
expected data structures.

There's also a post on Reddit which talks about this feature; since this
was posted 2 years ago, I'd say that at least *some* models support
this -- but maybe I'm just misinterpreting it?

https://www.reddit.com/r/HPOmen/comments/poxe2i/new_hp_omen_update_adds_option_for_manual_fan/

> I ended up just testing the patch for myself (after adding my model number to
> the arrays), and it does improve the GPU performance further for me, instead
> of the GPU dropping to 75W after 2-4 minutes it is now able to maintain at least
> 100W even after 10 minutes (tested with Quake 2 RTX). So, it seems like the timer
> thing also applies to my Omen 16-n0000 (8A42). That performance also roughly
> matches up with how notebookcheck.net says my Omen performs in their review
> (103W performance, 72W balanced), so it'd be great if you can also include my
> model in your patch.

Glad to hear that it helps! Your model will be part of both lists next time I
send my updated patches for a "refreshed" review :]
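
As a rough illustration of what "both lists" means, here is a minimal
userspace sketch. The array names come from the patch description; the
contents are cut down to the two boards from this thread (the existing
omen_thermal_profile_boards array has more entries than shown), the board
name is passed in as a plain string, and the `sketch_` helpers are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Boards with thermal profile selection (only this thread's boards shown). */
static const char * const omen_thermal_profile_boards[] = {
	"8BAD", "8A42",
};

/* Boards whose EC also runs the "force-switch" timer. */
static const char * const omen_timed_thermal_profile_boards[] = {
	"8BAD", "8A42",
};

/* Linear search for an exact board-name match. */
static bool sketch_board_in_list(const char * const *list, size_t len,
				 const char *board)
{
	assert(list != NULL);

	if (!board)
		return false;

	for (size_t i = 0; i < len; i++) {
		if (strcmp(list[i], board) == 0)
			return true;
	}
	return false;
}

static bool sketch_has_thermal_profiles(const char *board)
{
	return sketch_board_in_list(omen_thermal_profile_boards,
			SKETCH_ARRAY_LEN(omen_thermal_profile_boards), board);
}

static bool sketch_has_profile_timer(const char *board)
{
	return sketch_board_in_list(omen_timed_thermal_profile_boards,
			SKETCH_ARRAY_LEN(omen_timed_thermal_profile_boards), board);
}
```

Keeping the timed boards in their own array means a board can be listed for
profile selection without opting into the timer handling, which is the point
of having two lists.
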

> I just ran more tests with the CPU performance, and it seems that I might have
> misremembered how bad the fan curve was, since I have been limiting the CPU
> frequency to 4GHz instead of letting the CPU do its thing by itself (max boost is
> 4.7GHz), if I go back to the latter then with a heavily multithreaded workload
> (like compiling the kernel) the fans ramp up high within a few seconds of the
> CPU reaching 100C on its hottest core, and the CPU seems to maintain that
> temperature (or the sensors just don't report values above 100C, not sure). That
> seems worrying given that the supposed max operating temperature for the CPU
> (Ryzen 7 6800H) is 95C, but then again that probably won't be the case when gaming,
> which is the main use case for these laptops anyway.

This is definitely problematic, yet it's also roughly what I experience,
even though we have completely different combinations of CPUs and GPUs
-- at least we can say that the "draft" patches work regardless of the
underlying hardware, which is good to hear :]

Thanks for testing out my patch -- as I've said to Ilpo, I won't be able to
make much progress for a few weeks, BUT I'll be back on it as soon as I'm
available again!

Alexis



