On Sun, 21 Oct 2018 19:15:01 +0530
Suvayu Ali <fatkasuvayu+linux@xxxxxxxxx> wrote:

> Sorry for the late response, I wasn't monitoring the list.

Not a problem.

> I actually thought of what you mentioned in the other post, the
> firmware making a persistent change in some register. I really don't
> know how I could investigate that :-|

Without access to the hardware specs, the only way would be to ask
someone in the know. That would be the manufacturer of the hardware.
After looking at the code in the kernel, I think this is less likely;
if the firmware has never loaded, there can be no changed settings in
the device because of the firmware.

> I have tried modprobe, lsmod, etc, and I see the amdgpu module is
> loaded. I don't know how to check something for firmware. I guess
> the issue is, the module loads, but it fails to find and load the
> binary blob (firmware) that it needs (as seen in the journal). So
> the picture is consistent, I just can't figure out why it fails to
> find the firmware files. Does Fedora use a firmware loader? Maybe I
> can hack around and invoke it in "verbose mode". If it is something
> internal to the kernel, I doubt something like that would be possible.

You could compare the journal messages from the older firmware when it
succeeded with those from the failing firmware to see if there is any
difference. For that matter, compare the successful old load with the
failing old load. But I don't think it is failing to find the firmware
files that exist.

The AMD firmware is loaded in

    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

In that source, there are two other firmware blobs treated the same as
the raven: vega10 and vega12. Do you find those in the same place as
the raven firmware blob? Do they have the same permissions? This is
complex code, but the path is OK because there is an explicit check,
and it passes.

> I have tried rpm --verify, no luck. I will try your suggestion and
> see if that points to something.
> My plan of last resort is to do a
> fresh install when F29 is released, but understanding and solving
> this issue would be far more satisfying :).

After looking at the kernel code, I don't think this is the problem;
it is failing in kernel_read_file_from_path after a call chain from the
above device loader. It is as if there is a mismatch between the
manifest of files to load and the actual files present. To me that says
that the firmware is faulty because it is internally inconsistent. But
don't take that to the bank, it's just supposition from reading code
that I don't really understand fully.

I looked at the code for kernel 4.19, so it might be different from
4.18. Because there is a specific path for raven, I'm surprised that
the older firmware actually worked. The kernel looks for firmware with
raven in the name, and I didn't see a fallback, though I could have
missed it because everything is done via pointers to structs, and I
didn't get into that level of detail.

Finally, could there be leftovers from the ROCm code causing problems?
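
If it helps with checking which raven, vega10 and vega12 blobs are
actually present and what their permissions are, something along these
lines could do it. This is only a rough sketch: the directory
/lib/firmware/amdgpu is an assumption (it is the usual location, but
nothing in this thread confirms it for your install), the function name
list_firmware is made up for illustration, and matching on a
"<chip>_" filename prefix is my guess at the naming, not something
taken from the kernel source.

```python
import stat
from pathlib import Path

# Assumption: the usual amdgpu firmware directory; adjust if yours differs.
FIRMWARE_DIR = Path("/lib/firmware/amdgpu")
# The three chip names mentioned above.
CHIPS = ("raven", "vega10", "vega12")


def list_firmware(directory, chips=CHIPS):
    """Return {chip: [(filename, mode string, size in bytes), ...]}.

    A file is counted for a chip when its name starts with "<chip>_".
    A missing directory yields an empty list for every chip.
    """
    found = {chip: [] for chip in chips}
    directory = Path(directory)
    if not directory.is_dir():
        return found
    for entry in sorted(directory.iterdir()):
        if not entry.is_file():
            continue
        for chip in chips:
            if entry.name.startswith(chip + "_"):
                info = entry.stat()
                mode = stat.filemode(info.st_mode)  # e.g. "-rw-r--r--"
                found[chip].append((entry.name, mode, info.st_size))
    return found


def main():
    for chip, files in list_firmware(FIRMWARE_DIR).items():
        print(f"{chip}: {len(files)} file(s)")
        for name, mode, size in files:
            print(f"  {mode} {size:>9} {name}")


if __name__ == "__main__":
    main()
```

If the raven entries are missing, empty, or have different permissions
from the vega10 and vega12 ones, that would be worth reporting back to
the list.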