Re: Kernel 6.7+ broke under-powering of my RX 6700XT. (Archlinux, mesa/amdgpu)


 



Hello everyone,

A patch by user @fililip was posted in the GitLab issue linked below, but it was not submitted upstream:

"I think I'd have to submit it to the linux kernel mailing list, which I am kinda scared of 😅. It could be better to submit that patch to Arch Linux maintainers; they could include it in their kernel builds."

The implementation of this patch can be simplified by setting:

smu->min_power_limit = amdgpu_ignore_min_pcap ? 0 : whatever_default_smuxx;

and leaving the rest of the code unchanged (except for defining the amdgpu_ignore_min_pcap variable, of course). I hope nothing tricky is needed and nothing has to be reverted. Please add it to the mainline kernel as an option; it certainly should not be specific to Arch Linux.
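
To make the idea concrete, here is a minimal userspace sketch of it, not actual driver code. The struct, the helper names and the 190W/230W figures used in the note below are only illustrative assumptions; amdgpu_ignore_min_pcap is the variable proposed above, which in a real driver would presumably be exposed as a module parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's SMU state; only the fields
 * needed for this sketch are modeled. */
struct smu_sketch {
    uint32_t min_power_limit; /* watts */
    uint32_t max_power_limit; /* watts */
};

/* Hypothetical opt-in flag, named after the variable proposed above.
 * Assumption: a real driver would expose it as a module parameter. */
static bool amdgpu_ignore_min_pcap = false;

/* Set the lower bound once: 0 when the override is enabled,
 * otherwise the default minimum reported for the ASIC. */
static void sketch_init_limits(struct smu_sketch *smu,
                               uint32_t default_min, uint32_t default_max)
{
    smu->min_power_limit = amdgpu_ignore_min_pcap ? 0 : default_min;
    smu->max_power_limit = default_max;
}

/* Range check left unchanged by the proposal.
 * Returns 0 if the requested cap is accepted, -1 if it is out of range. */
static int sketch_set_power_limit(const struct smu_sketch *smu, uint32_t watts)
{
    if (watts < smu->min_power_limit || watts > smu->max_power_limit)
        return -1;
    return 0;
}
```

With the flag off and limits initialised to 190..230, a request for 115 is rejected; with the flag on, the same request is accepted while the upper bound still applies. Only the initialisation differs, which is why the rest of the code could stay as it is.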

Roman



On 2/19/24 12:15, Linux regression tracking (Thorsten Leemhuis) wrote:
On 17.02.24 14:30, Greg KH wrote:
> On Sat, Feb 17, 2024 at 02:01:54PM +0100, Roman Benes wrote:
>> The minimum power limit on the latest (6.7+) kernels is 190W for my GPU
>> (RX 6700XT, mesa, Arch Linux), and I cannot set the power cap as low as
>> before (115W), neither with CoreCtrl, LACT nor TuxClocker; the relevant
>> /sys variable is read-only even for root. This is not an issue with
>> those apps but with the kernel; I have read similar reports in their
>> bug trackers. I downgraded to the v6.6.10 kernel and my 115W
>> (under-power) cap works again as before.
> Any chance you can use 'git bisect' to figure out the offending change?

For the record and everyone that lands here: the cause is known now
(it's 1958946858a62b ("drm/amd/pm: Support for getting power1_cap_min
value") [v6.7-rc1]) and the issue afaics tracked here:

https://gitlab.freedesktop.org/drm/amd/-/issues/3183

Other mentions:
https://gitlab.freedesktop.org/drm/amd/-/issues/3137
https://gitlab.freedesktop.org/drm/amd/-/issues/2992

Haven't seen any statement from the amdgpu developers (now CCed) yet on
this there (but might have missed something!). From what I can see I
assume this will likely be somewhat tricky to handle, as a revert
overall might be a bad idea here. We'll see I guess.

Roman posted something that apparently was meant to go to the list, so
let me put it here:

"""
UPDATE: User fililip already posted a patch, but it needs to be merged;
the discussion is at the GitLab link below.

(PS: I hope I am replying correctly to "all" now? - using original addr.)


It seems that the commit was already found (see user fililip's comment):

https://gitlab.freedesktop.org/drm/amd/-/issues/3183
commit 1958946858a62b6b5392ed075aa219d199bcae39
Author: Ma Jun <Jun.Ma2@xxxxxxx>
Date:   Thu Oct 12 09:33:45 2023 +0800

     drm/amd/pm: Support for getting power1_cap_min value

     Support for getting power1_cap_min value on smu13 and smu11.
     For other Asics, we still use 0 as the default value.

     Signed-off-by: Ma Jun <Jun.Ma2@xxxxxxx>
     Reviewed-by: Kenneth Feng <kenneth.feng@xxxxxxx>
     Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>

However, this is not good, as it cuts the under-powering range too far. I
was getting only about 7% less performance but 90W(!) less consumption
when capped at 115W before. Also, I wonder whether we, as an OS of options
and freedom, have to stick to such a very high reference for minimum
values without the ability to override them through some /sys controls.
The commit was made by an AMD employee, and I wonder whether it was
prompted by this post I made a few months ago (business strategy?):

https://www.reddit.com/r/Amd/comments/183gye7/rx_6700xt_from_230w_to_capped_115w_at_only_10/
This is not a dangerous upward OC, where I can understand the desire to
protect the HW; it is downward. With the card running at almost the same
speed at 115W, refusing anything below a 190W minimum cap is IMO crazy.
We are not talking about default or reference values here either, just a
move to narrow the range of options for whatever reason.
I don't know how much power you guys have over them, but please
consider either reverting this change or giving us an option to set
min_cap through, say, /sys (right now the param is read-only, even for root).

Thank you in advance for looking into this, with regards:  Romano
"""

And while at it, let me add this issue to the tracking as well

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot introduced 1958946858a62b /
#regzbot title drm: amdgpu: under-powering broke

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


