Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

Hi,

I can confirm that setting amdgpu.sg_display=0 does not fix the issue for me.
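
For anyone reproducing this, it helps to confirm that the parameter actually took effect on the running kernel before concluding that it does not help. A minimal sketch, assuming the amdgpu module exposes the parameter read-only under `/sys/module/amdgpu/parameters/sg_display` (the path and the helper name `read_module_param` are assumptions for illustration, not something stated in this thread):

```python
from pathlib import Path

# Assumed sysfs location of the amdgpu S/G display parameter.
SG_DISPLAY_PARAM = Path("/sys/module/amdgpu/parameters/sg_display")


def read_module_param(path=SG_DISPLAY_PARAM):
    """Return the parameter value as a stripped string, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        # Module not loaded, parameter not exported, or no permission.
        return None


if __name__ == "__main__":
    value = read_module_param()
    if value is None:
        print("sg_display not readable (module not loaded or parameter not exported)")
    else:
        print(f"amdgpu.sg_display = {value}")
```

If the value printed is not `0`, the kernel command line or modprobe option was probably not picked up.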

I have 64GB of Kingston memory running with XMP at 5200MHz. I attached the result of `dmidecode --type=memory` to this email.

Kind regards
Felix Richter

On 05.06.23 17:27, Hamza Mahfooz wrote:

On 6/3/23 10:52, Felix Richter wrote:
Hi Guys,

sorry for the silence from my side. I had a lot of things to take care of after returning from vacation. Also I had to wait on the zfs modules to be updated to support kernel 6.3 for further testing.

The bad news is that I am still experiencing issues. I have been able to get a reproducible trigger for the buggy behavior. The moment I take a screenshot, or any other program like `wdisplays` accesses the screen buffer, the screen starts flickering. The only way to reset it is to reboot the machine or log out of the desktop.

With this I did a bisection to figure out which commit is responsible for this. I attached the logs to the mail. The short version is that I identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the culprit. It seems that there are side effects of having more flexible buffer placement for the case of the internal GPU. To verify that this actually is the cause of the issue, I built the current archlinux kernel with an extra patch to revert the commit: https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that the bug is fixed!
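
Conceptually, the bisection described above is a binary search over the commit range. The following is only a sketch of that idea, assuming a linear history ordered oldest to newest, a known-good first commit, and a known-bad last commit; the function name `first_bad` and the `is_bad` callback (standing in for "build, boot, take a screenshot, watch for flicker") are hypothetical. The real `git bisect` also handles merges and skipped commits, which this does not.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in a linear history.

    commits: list ordered oldest to newest, with commits[0] known good
             and commits[-1] known bad (at least two entries).
    is_bad:  callable that tests one commit and returns True if the bug shows.
    """
    lo, hi = 0, len(commits) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


if __name__ == "__main__":
    history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
    bad_from = history.index("c4")  # pretend the regression landed in c4
    print(first_bad(history, lambda c: history.index(c) >= bad_from))
```

Each test halves the remaining range, so the number of kernel builds grows only logarithmically with the number of commits between the good and bad versions.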

Now if this is the desired long term fix I do not know …

Can you provide a dmidecode of your RAM (i.e. # dmidecode --type=memory)?

The current trend seems to suggest that if you have 64 or more gigs of
RAM, you will probably still experience issues with S/G mode enabled
even with my fix applied.


Kind regards,
Felix Richter

On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
On 02.05.23 15:48, Felix Richter wrote:
On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
On 02.05.23 15:13, Alex Deucher wrote:
On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
Leemhuis)<regressions@xxxxxxxxxxxxx>  wrote:

On 30.04.23 13:44, Felix Richter wrote:
Hi,

I am running into an issue with the integrated GPU of the Ryzen 9
7950X. It seems to be a regression from kernel version 6.1 to 6.2.
The bug materializes in the form of my monitor blinking, meaning it
turns full white shortly. This happens very often so that the
system becomes unpleasant to use.

I am running the Archlinux Kernel:
The Issue happens on the bleeding edge kernel: 6.2.13
Switching back to the LTS kernel resolves the issue: 6.1.26

I have two monitors attached to the system. One 42 inch 4k Display
and a 24 inch 1080p Display and am running sway as my desktop.

Let me know if there is more information I could provide to help
narrow down the issue.
Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title drm: amdgpu: system becomes unpleasant to use after
monitor starts blinking and turns full white
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify
when the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags
pointing to the report (the parent of this mail). See page linked in
footer for details.
This sounds exactly like the issue that was fixed in this patch which
is already on its way to Linus:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
FWIW, you in the flood of emails likely missed that this is the same
thread where you yesterday replied "If the module parameter didn't help
then perhaps you are seeing some other issue.  Can you bisect?". That's
why I decided to add this to the tracking. Or am I missing something
obvious here?

/me looks around again and can't see anything, but that doesn't have to
mean anything...

Felix, btw, this guide might help you with the bisection, even if it's
just for kernel compilation:

https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html

And to indirectly reply to your mail from yesterday[1]. You might want
to ignore the arch linux kernel git repo and just do a bisection between
6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
I'd also try 6.3 or even mainline before that, in case the issue was
fixed already.

[1]
https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@xxxxxxxxxxxxxxxxx/

Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
the newest commit.
FWIW, I wonder what you actually mean with "newest commit" here: a
bisection between 6.1 and mainline HEAD might be a waste of time, *if*
this is something that only happens in 6.2.y (say due to a broken or
incomplete backport)

That was the part I was mostly unsure about … where
to start from.

I was planning to use PKGBUILD scripts from arch to achieve the same
configuration as I would when installing the package and just rewrite
the script to use a local copy of the source code instead of the
repository. That way I can just use the bisect command, rebuild the
package and test again.
In my experience trying to deal with Linux distro's package managers
creates more trouble than it's worth.

But I probably won't be able to finish it this week, since I am on
vacation starting tomorrow and will not have access to the computer in
question. I will be back next week, by that time the patch Alex is
talking about might already be in mainline. So if that fixes it, I will
notice and let you know. If not I will do the bisection to figure out
what the actual issue is.
Enjoy your vacation!

Ciao, Thorsten
# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 3.5.0 present.

Handle 0x000F, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: None
	Maximum Capacity: 128 GB
	Error Information Handle: 0x000E
	Number Of Devices: 4

Handle 0x0012, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0011
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL A
	Type: Unknown
	Type Detail: Unknown

Handle 0x0014, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0013
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL A
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 5200 MT/s
	Manufacturer: Kingston
	Serial Number: D10C970D
	Asset Tag: Not Specified
	Part Number: KF552C40-32         
	Rank: 2
	Configured Memory Speed: 5200 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version: Unknown
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None

Handle 0x0017, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0016
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL B
	Type: Unknown
	Type Detail: Unknown

Handle 0x0019, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0018
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL B
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 5200 MT/s
	Manufacturer: Kingston
	Serial Number: D50C9730
	Asset Tag: Not Specified
	Part Number: KF552C40-32         
	Rank: 2
	Configured Memory Speed: 5200 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version: Unknown
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None
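
The question raised earlier in the thread was whether the machine has 64 GB or more of RAM. As a rough sketch, the installed total can be summed from `dmidecode --type=memory` output like the dump above. The function name `installed_memory_gb` and the unit table are assumptions made for illustration; the 64 GB figure is only the trend suggested in the thread, not a documented threshold.

```python
import re

# Conversion factors to GB for the size units this sketch understands.
_UNIT_TO_GB = {"MB": 1 / 1024, "GB": 1, "TB": 1024}

# Matches only the plain "Size:" line of a Memory Device block, not
# "Volatile Size:", "Cache Size:" and similar, and not "No Module Installed".
_SIZE_LINE = re.compile(r"^[ \t]*Size:[ \t]*(\d+)[ \t]*(MB|GB|TB)[ \t]*$", re.MULTILINE)


def installed_memory_gb(dmidecode_text):
    """Sum the sizes of all populated Memory Device entries, in GB."""
    total = 0.0
    for block in dmidecode_text.split("\n\n"):
        if "Memory Device" not in block:
            continue  # skip the Physical Memory Array and header blocks
        match = _SIZE_LINE.search(block)
        if match:
            total += int(match.group(1)) * _UNIT_TO_GB[match.group(2)]
    return total


if __name__ == "__main__":
    import sys

    # Usage (as root): dmidecode --type=memory | python3 this_script.py
    total = installed_memory_gb(sys.stdin.read())
    print(f"Installed memory: {total:g} GB")
    if total >= 64:
        print("64 GB or more: the range where the thread suggests S/G issues may persist")
```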

