On Wed, Jan 03, 2024 at 11:21:24AM +1100, Prajna Sariputra wrote:
> On Sunday, 31 December 2023 9:46:25 PM AEDT Alexis Belmonte wrote:
> > Add 8BAD to the list of boards which have thermal profile selection
> > available. This allows the CPU to draw more power than the default TDP
> > barrier defined by the 'balanced' thermal profile (around 50W), hence
> > allowing it to perform better without being throttled by the embedded
> > controller (around 130W).
> >
> > We first need to set the HP_OMEN_EC_THERMAL_PROFILE_TIMER_OFFSET to zero.
> > This prevents the timer countdown from reaching zero, making the embedded
> > controller "force-switch" the system's thermal profile back to 'balanced'
> > automatically.
> >
> > We also need to put a number of specific flags in
> > HP_OMEN_EC_THERMAL_PROFILE_FLAGS_OFFSET when we're switching to another
> > thermal profile:
> >
> >  - for 'performance', we need to set both HP_OMEN_EC_FLAGS_TURBO and
> >    HP_OMEN_EC_FLAGS_NOTIMER;
> >
> >  - for 'balanced' and 'powersave', we clear out the register to notify
> >    the system that we want to lower the TDP barrier as soon as possible.
>
> Do you know if there's a way to check that a given model has this specific timer,
> other than just testing the patch?

I haven't been able to figure that out yet -- there's a 'device_list.json' file
(IIRC) defined somewhere in the Omen Control Center app which I came across, but
I found no simple way of universally checking whether this behavior is active :[

I think I remember seeing another model ID defined near mine, so I think I
*could* add it directly to both lists, so that's that.

> I have an Omen 16-n0000 (8A42), which has a Ryzen 7 6800H and a Radeon
> RX 6650M, and I've been patching the kernel to add it to the omen_thermal_profile_boards
> array for a while now.
> Just doing that prevents the worst of the throttling from
> happening (GPU dropping from 105W to 35W and the CPU being stuck at like 2GHz
> or less), but currently the GPU still drops to 75W eventually. Switching to
> performance does make it go back to 105W (and even 120W for a bit) before it
> goes back down to 75W, so it makes me wonder if there is actually a timer on my
> model that's doing it rather than just thermal throttling.

I think you've answered it yourself with your other mail ;]

> > The third flag defined in the hp_thermal_profile_omen_flags enum,
> > HP_OMEN_EC_FLAGS_JUSTSET, is present for completeness.
> >
> > To prevent potential behaviour breakage with other Omen models, a
> > separate omen_timed_thermal_profile_boards array has been added to list
> > which boards expose this behaviour.
> >
> > Performance benchmarking was done with the help of silver.urih.com and
> > Google Chrome 120.0.6099.129, on Gnome 45.2, with the 'performance'
> > thermal profile set:
> >
> > |                  | Performance | Stress     | TDP   |
> > |------------------|-------------|------------|-------|
> > | with my patch    | P84549      | S0.1891    | 131W  |
> > | without my patch | P44084      | S0.1359    | 47W   |
> >
> > The TDP measurements were done with the help of the s-tui utility,
> > during the load.
> >
> > There is still work to be done:
> >
> >  - tune the CPU and GPU fans to better cool down and enhance
> >    performance at the right time; right now, it seems that the fans are
> >    not properly reacting to thermal/performance events, which in turn
> >    either causes thermal throttling OR makes the fans spin way too long,
> >    even though the temperatures have lowered down
>
> Yeah, that's also a problem with my model, where with a heavy CPU only workload
> the CPU would boost high and almost immediately run into thermal throttling and
> stays throttled for like a few minutes before the fans ramp up even a little,
> which is why I'm not sure that adding my model to the list upstream would be a
> good idea. My CPU doesn't seem to boost all that high though, I don't remember
> the performance mode making much of a difference the last time I tested it.

I totally agree with you -- I just wanted to make sure that my patch conformed
well enough to the rest of the codebase before making further progress :]

> Also, for what it's worth I had a conversation with one of the folks who wrote
> the platform profile code (Enver Balalic) a few months ago, and they said the
> profiles are just fan speed control modes on their Omen model.

I've done some reverse engineering on the Omen Control Center app through a
Windows VM, and I came across a few WMI calls in a class (IIRC `HpaClient`, or
something similar to that name) that read this fan curve. I haven't yet found
the parts that do the writes, unfortunately; that still needs to be looked
through. :[

The ACPI table that handles those WMI "methods" is the `SSDT` one -- I've
disassembled it with `iasl`, which really helped with figuring out the expected
data structures.

There's also a post on Reddit which talks about this feature; since it was
posted 2 years ago, I'd say that at least *some* models support this -- but
maybe I'm just misinterpreting it?
https://www.reddit.com/r/HPOmen/comments/poxe2i/new_hp_omen_update_adds_option_for_manual_fan/

> I ended up just testing the patch for myself (after adding my model number to
> the arrays), and it does improve the GPU performance further for me, instead
> of the GPU dropping to 75W after 2-4 minutes it is now able to maintain at least
> 100W even after 10 minutes (tested with Quake 2 RTX). So, it seems like the timer
> thing also applies to my Omen 16-n0000 (8A42). That performance also roughly
> matches up with how notebookcheck.net says my Omen performs in their review
> (103W performance, 72W balanced), so it'd be great if you can also include my
> model in your patch.

Glad to hear that it helps! Your model will be part of both lists next time I
send my updated patches for a "refreshed" review :]

> I just ran more tests with the CPU performance, and it seems that I might have
> misremembered how bad the fan curve was, since I have been limiting the CPU
> frequency to 4GHz instead of letting the CPU do its thing by itself (max boost is
> 4.7GHz), if I go back to the latter then with a heavily multithreaded workload
> (like compiling the kernel) the fans ramp up high within a few seconds of the
> CPU reaching 100C on its hottest core, and the CPU seems to maintain that
> temperature (or the sensors just don't report values above 100C, not sure). That
> seems worrying given that the supposed max operating temperature for the CPU
> (Ryzen 7 6800H) is 95C, but then again that probably won't be the case when gaming,
> which is the main use case for these laptops anyway.

This is definitely problematic, yet it is also roughly what I experience, even
though we have completely different combinations of CPUs and GPUs -- at least
we can say that the "draft" patches work regardless of the underlying hardware,
which is good to hear :]

Thanks for testing out my patch -- as I've said to Ilpo, I won't be able to
make much progress for a few weeks BUT I'll be back on it as soon as I'm
available again!

Alexis