https://bugzilla.kernel.org/show_bug.cgi?id=219611

Bug ID: 219611
Summary: Read of pcie_bw sysfs file on AMD GPU blocks for 1 second
Product: Drivers
Version: 2.5
Hardware: Intel
OS: Linux
Status: NEW
Severity: normal
Priority: P3
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@xxxxxxxxxxxxxxxxxxxx
Reporter: yumpusamongus+kernelbugzilla@xxxxxxxxx
Regression: No

Multiple userspace resource monitors have been tripped up by this:

https://github.com/Syllo/nvtop/issues/139
https://github.com/Syllo/nvtop/issues/208
https://github.com/aristocratos/btop/issues/793
https://gitlab.com/mission-center-devs/mission-center/-/issues/309

The behavior is highly unusual and would require special treatment of just that file in userspace. The docs say "The amdgpu driver provides a sysfs API for estimating how much data has been received and sent by the GPU in the last second through PCIe". Specifically, the LAST second, not the second starting when read() was called.

The culprit, as far as I can tell, is the msleep here:

https://elixir.bootlin.com/linux/v6.12.4/source/drivers/gpu/drm/amd/amdgpu/soc15.c#L756

(the same code is copy-pasted in 4 places).

I am not familiar with the intricacies of AMD GPUs, but what would be the cost of having those counters enabled all the time, and reporting the number of messages in some recent second? Or, even better, ripping this out and exposing the integrating message counts directly, so userspace can choose whichever sample rate it wants?
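
To make the report concrete, here is a minimal Python sketch of what a resource monitor runs into and what the proposed alternative would look like from userspace. Several things in it are assumptions rather than facts from this report: the path uses card0, which varies per system; the three-integer layout (received messages, sent messages, max payload size) follows my reading of the amdgpu docs; and rate_per_second() is purely hypothetical, since the kernel does not currently expose cumulative counters for it to consume.

```python
import time
from pathlib import Path

# Assumed location; the card index differs between systems.
PCIE_BW_PATH = Path("/sys/class/drm/card0/device/pcie_bw")


def timed_read(path):
    """Read a file once and return (stripped contents, elapsed seconds)."""
    start = time.monotonic()
    text = Path(path).read_text()
    return text.strip(), time.monotonic() - start


def parse_pcie_bw(text):
    """Parse the three whitespace-separated integers the file is assumed to
    hold: received messages, sent messages, max payload size in bytes."""
    received, sent, mps = (int(field) for field in text.split())
    return received, sent, mps


def rate_per_second(prev, curr):
    """Hypothetical: derive messages per second from two
    (timestamp, cumulative_count) samples taken at any interval."""
    (t0, c0), (t1, c1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (c1 - c0) / (t1 - t0)


if __name__ == "__main__":
    try:
        text, elapsed = timed_read(PCIE_BW_PATH)
    except OSError as exc:
        # Missing file, unsupported hardware, or insufficient permissions.
        print(f"could not read {PCIE_BW_PATH}: {exc}")
    else:
        print(parse_pcie_bw(text), f"read took {elapsed:.3f}s")
```

On an affected system the elapsed time printed by the first helper should come out at roughly one second per read. With cumulative counters, a monitor would instead keep the previous sample and call rate_per_second() at whatever interval it already polls, without blocking.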