On 18.06.2016 10:15, Mads wrote:
> Hi!
>
> For a while now I've been having issues with my HP EliteDesk 705 G2 mini
> PC[1].
>
> If I open up e.g. dolphin or konsole when in kde plasma 5.6.4, the
> screen corrupts and locks up, and this appears in dmesg:
>
> juni 17 22:50:42 hphtpc kernel: amdgpu 0000:00:01.0: GPU fault detected:
> 146 0x0842b714
> juni 17 22:50:42 hphtpc kernel: amdgpu 0000:00:01.0:
> VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00101508
> juni 17 22:50:42 hphtpc kernel: amdgpu 0000:00:01.0:
> VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0B0B7014
> juni 17 22:50:42 hphtpc kernel: VM fault (0x14, vmid 5) at page 1053960,
> write from 'SDM0' (0x53444d30) (183)

... snip ...

> This didn't happen back with mesa-11.2.2 built against llvm 3.8.0, but
> that starts to be quite a lot of commits ago now, considering the
> development pace mesa's got at the moment.
>
> I tried out mesa and llvm from git and svn around when Bas Nieuwenhuizen
> posted those GL compute shaders for radeonsi patches[2], and I think
> that's when it was the first time I saw the bug.

Compute shaders aren't used by a plain desktop, but the VM fault
indicates a write from the SDMA engine, which also saw a lot more use
during that timeframe.

> It seems that the bug appears no matter what kernel I try to use, I've
> been through countless iterations of drm-next-4.7 kernels and
> drm-fixes-4.6 kernels, but it seems to happen no matter what I use. The
> error message pasted above comes from gentoo provided 4.6.2-kernel:
>
> # uname -a
> Linux hphtpc 4.6.2-gentoo #2 SMP PREEMPT Mon Jun 13 21:27:32 CEST 2016
> x86_64 AMD PRO A12-8800B R7, 12 Compute Cores 4C+8G AuthenticAMD GNU/Linux
>
> Am I at the right mailing list for this kind of bug? How can I debug
> this further?

Since you've tried a lot of kernel variations, I'm tempted to look for
the problem in Mesa. A couple of things you could try:

1) Run R600_DEBUG=testdma,check_vm glxgears (or any other GL app,
really). This executes a DMA self-test.
Observe whether there are any failures and whether you get VM faults
associated with the run in dmesg. (The self-test runs indefinitely,
until you Ctrl+C out of it.)

2) Start your desktop session with R600_DEBUG=nodma and see if that
makes the VM faults go away. (Please make sure that the environment
variable actually makes it through, by looking at /proc/$pid/environ,
where $pid is the PID of kwin and other relevant processes.)

3) Do dolphin and konsole use OpenGL directly in your setup, or is it
just the compositor?

4) Something else I notice is that the page numbers of the VM faults are
of the form 0x001xxxxx. This suggests a 32-bit address underflow, i.e.
an address wraps around to a very large 32-bit number. Could you please
install a version of Mesa with assertions enabled (--enable-debug in
./configure does the trick) and see if some check is triggered?

Nicolai

>
> - Mads
>
> ---------
> [1]
> http://store.hp.com/us/en/PDPStdView?catalogId=10051&urlLangId=-1&langId=-1&productId=1086676&storeId=10151
>
> [2] https://lists.freedesktop.org/archives/mesa-dev/2016-April/111638.html
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
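
For the check in step 2 above, here is a minimal sketch of reading one variable out of /proc/$pid/environ. The helper names proc_env_value and report_r600_debug are made up for illustration, and kwin_x11 as the compositor's process name is an assumption; adjust it to whatever your session actually runs.

```shell
#!/bin/sh
# Print the value of variable $2 from the environ-style file $1.
# Entries in /proc/<pid>/environ are separated by NUL bytes, so they are
# turned into newlines first. Prints nothing if the variable is not set.
proc_env_value() {
    tr '\0' '\n' < "$1" | sed -n "s/^$2=//p"
}

# Report R600_DEBUG for every running process with the exact name $1.
# The process name to pass in (e.g. kwin_x11) is an assumption.
report_r600_debug() {
    for pid in $(pgrep -x "$1"); do
        printf '%s (pid %s): R600_DEBUG=%s\n' "$1" "$pid" \
            "$(proc_env_value "/proc/$pid/environ" R600_DEBUG)"
    done
}
```

Calling report_r600_debug kwin_x11 from a terminal inside the session should then show whether nodma reached the compositor. An empty value after the "=" means the variable did not make it through. Reading another user's /proc/$pid/environ needs root.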