Hi guys,
On 10.08.20 at 08:43, Alexander Monakov wrote:
Hi, you should Cc a specialized mailing list and a relevant maintainer, otherwise your email is likely to be ignored as LKML is an incredibly high-volume list. Adding amd-gfx and Alex Deucher.
Thanks for forwarding this. AFAIK we haven't heard of this bug before, but Alex might already know more about it.
More thoughts below.

On Sun, 9 Aug 2020, Ignat Insarov wrote:

Hello!

This is an issue report. I am not familiar with the Linux kernel development procedure, so please direct me to a more appropriate or specialized medium if this is not the right avenue.

My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than not. Screen blackness correlates with a line in the `systemd` journal that says `RAM width Nbits DDR4`, where N is either 128 (resulting in dark screen) or 64 (resulting in a healthy boot). The number seems to be chosen at random with bias towards 128. This has been going on for a while so here are some statistics:

* 356 boots proceed far enough to attempt mode setting.
* 82 boots set RAM width to 64 bits and presumably succeed.
* 274 boots set RAM width to 128 bits and presumably fail.

The issue is prevented with the `nomodeset` kernel option.

I reported this previously (about a year ago) on the forum of my Linux distribution.[1] The issue still persists as of linux 5.8.0. The details of my graphics controller, as well as some journal excerpts, can be seen at [1]. One thing that has changed since then is that on failure, there now appears a null pointer dereference error.

I am attaching the log of kernel messages from the most recent failed boot; please request more information if needed.

I appreciate any directions and advice as to how I may go about fixing this annoyance.

[1]: https://bbs.archlinux.org/viewtopic.php?id=248273

On the forum you show that in the "success" case there's one less "BIOS signature incorrect" message. This implies that amdgpu_get_bios() in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c gets the video BIOS from a different source. If that happens every time (one "signature incorrect" message for "success", two for "failure") that may be relevant to the problem you're experiencing.
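One way to check whether that correlation holds on every boot, without rebuilding anything, is to tally the saved kernel logs. The following is only a rough sketch: the helper names `summarize_boot` and `tally` are made up for illustration, and the regular expression and message substring are assumptions based on the strings quoted in this thread, so adjust them to what your journal actually prints.

```python
import re
from collections import Counter

# Assumed message shapes, taken from the strings quoted in this thread.
RAM_WIDTH_RE = re.compile(r"RAM width (\d+)bits")
BAD_SIGNATURE = "BIOS signature incorrect"


def summarize_boot(log_text):
    """Return (ram_width, bad_signature_count) for one boot's kernel log."""
    match = RAM_WIDTH_RE.search(log_text)
    width = int(match.group(1)) if match else None
    return width, log_text.count(BAD_SIGNATURE)


def tally(boot_logs):
    """Count boots per (ram_width, bad_signature_count) pair."""
    return Counter(summarize_boot(text) for text in boot_logs)
```

Each element of `boot_logs` would be the kernel log text of a single boot, for example the output of `journalctl -k -b -1`, `journalctl -k -b -2`, and so on. If every 64-bit boot lands in the one-message bucket and every 128-bit boot in the two-message bucket, that supports the theory above.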
If you don't mind patching and rebuilding the kernel, I suggest adding debug printks to the aforementioned function to see exactly which methods fail with a wrong signature and which one succeeds. It might also be worthwhile to check whether there's a BIOS update for your laptop.
It might also be a good idea to try the latest amd-staging-drm-next branch from Alex's repository (bear with me, I don't have the link at hand, but it should be easy to find).
Opening a bug report or searching the existing ones for something similar under https://gitlab.freedesktop.org/drm/amd/-/issues might be a good idea as well.
And I completely agree that this sounds like an issue getting the BIOS image.
Thanks,
Christian.
Alexander
_______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx