Nice work. Thanks for tracking this down!

Alex

On Tue, Oct 30, 2018 at 12:32 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
> On Mon, 29 Oct 2018, Alex Deucher wrote:
>
> > On Thu, Oct 25, 2018 at 4:46 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 24 Oct 2018, Mikulas Patocka wrote:
> > >
> > > > Hi
> > > >
> > > > I have a Sapphire Pulse RX 570 ITX graphics card.
> > > >
> > > > On Linux, I get errors "amdgpu: [powerplay] failed to send message 148 ret
> > > > is 0" and the system is stuck for several seconds when they happen. The
> > > > card works, except for these errors and occasional delays.
> > >
> > > I've found that PP_PCIE_DPM_MASK causes these errors. If I turn this bit
> > > off in amdgpu.ppfeaturemask, there are no more errors. (Turning it
> > > off also fixes hibernation problems.)
> > >
> > > Should it be turned off automatically in response to these errors?
> >
> > What platform are you running on? Are you running in a VM? The
> > driver accesses pci config space on the bridge to determine the pcie
> > gen and lane caps of the platform to determine what clocks and lanes
> > are valid. See amdgpu_device_get_pcie_info(). It would be good to
> > figure out why this is not working on your platform.
> >
> > Alex
>
> It's not a VM. It's an old motherboard with dual socket F. It has HT2000
> north bridge and HT1000 south bridge. It has two PCIe-v1 8-lane slots.
>
> I've found the bug - pcie_get_speed_cap incorrectly tests the lnkcap
> variable against values that are not bit-masks, so that the PCIe port is
> incorrectly reported as 8 GT/s capable. When I fix these tests, the errors
> are gone.
>
> Mikulas

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
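
For readers of the archive, here is a minimal, self-contained sketch of the kind of bug described above: a field whose encodings are enumerated values gets tested with `&` as if each encoding were a separate bit. It is an illustration only, not the kernel's actual pcie_get_speed_cap() and not the submitted patch. The PCI_EXP_LNKCAP_SLS* values are quoted from memory of include/uapi/linux/pci_regs.h and should be treated as assumptions; the enum and both function names are hypothetical.

```c
#include <stdint.h>

/* Assumed values of the Link Capabilities "Supported Link Speeds" field
 * (low 4 bits of the register). These are encodings, not one bit each. */
#define PCI_EXP_LNKCAP_SLS        0x0000000f
#define PCI_EXP_LNKCAP_SLS_2_5GB  0x00000001
#define PCI_EXP_LNKCAP_SLS_5_0GB  0x00000002
#define PCI_EXP_LNKCAP_SLS_8_0GB  0x00000003
#define PCI_EXP_LNKCAP_SLS_16_0GB 0x00000004

/* Hypothetical stand-in for the kernel's speed enumeration. */
enum link_speed {
	SPEED_UNKNOWN,
	SPEED_2_5GT,
	SPEED_5_0GT,
	SPEED_8_0GT,
	SPEED_16_0GT
};

/* Buggy pattern: treats each encoding as a bit-mask. Because
 * 0x1 & 0x3 is non-zero, a 2.5 GT/s port is reported as 8.0 GT/s. */
enum link_speed speed_cap_buggy(uint32_t lnkcap)
{
	if (lnkcap & PCI_EXP_LNKCAP_SLS_16_0GB)
		return SPEED_16_0GT;
	if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
		return SPEED_8_0GT;
	if (lnkcap & PCI_EXP_LNKCAP_SLS_5_0GB)
		return SPEED_5_0GT;
	if (lnkcap & PCI_EXP_LNKCAP_SLS_2_5GB)
		return SPEED_2_5GT;
	return SPEED_UNKNOWN;
}

/* Fixed pattern: isolate the field with its mask, then compare the
 * result for equality against each encoding. */
enum link_speed speed_cap_fixed(uint32_t lnkcap)
{
	switch (lnkcap & PCI_EXP_LNKCAP_SLS) {
	case PCI_EXP_LNKCAP_SLS_16_0GB:
		return SPEED_16_0GT;
	case PCI_EXP_LNKCAP_SLS_8_0GB:
		return SPEED_8_0GT;
	case PCI_EXP_LNKCAP_SLS_5_0GB:
		return SPEED_5_0GT;
	case PCI_EXP_LNKCAP_SLS_2_5GB:
		return SPEED_2_5GT;
	default:
		return SPEED_UNKNOWN;
	}
}
```

Under these assumed values, a PCIe-v1 port whose field reads 0x1 makes speed_cap_buggy() return SPEED_8_0GT while speed_cap_fixed() returns SPEED_2_5GT, which matches the symptom reported in the thread.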