On 18.06.2016 13:16, Mads wrote:
> That was quick! :)
>
> On 2016-06-18 11:28, Nicolai Hähnle wrote:
>>
>> Since you've tried a lot of kernel variations, I'm tempted to look for
>> the problem in Mesa. A couple of things you could try:
>>
>> 1) Run R600_DEBUG=testdma,check_vm glxgears (or any other GL app,
>> really). This executes a DMA self-test. Observe whether there are any
>> failures and whether you get VM faults associated to the run in dmesg.
>> (The self-test runs indefinitely, until you Ctrl+C out of it.)
>
> last line before ctrl+c:
> 342: dst = ( 80 x 104 x 1, 2D_TILED_THIN1), src = ( 1164 x 1940 x 1, 2D_TILED_THIN1), bpp = 16, BLITs: GFX = 30, DMA = 0, pass [343/343]
>
> It didn't seem to cause any issues, no messages in dmesg...
>
>> 2) Start your desktop session with R600_DEBUG=nodma and see if that
>> makes the VM faults go away. (Please make sure that the environment
>> variable actually makes it through, by looking at /proc/$pid/environ,
>> where $pid is the PID of kwin and other relevant processes.)
>
> It set it globally, and I could see krunner's environ-file containing
> R600_DEBUG=nodma. Still corruption, graphical lock up and this output
> from dmesg after starting dolphin:
>
> [ 1188.562864] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0842b714
> [ 1188.562870] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00101508
> [ 1188.562872] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D0B7014
> [ 1188.562875] VM fault (0x14, vmid 6) at page 1053960, write from 'SDM0' (0x53444d30) (183)

.. snip ..

That's surprising. Would CP DMA also appear as write from 'SDM0'? I
doubt it...

>> 3) Do dolphin and konsole use OpenGL directly in your setting, or is
>> it just the compositor?
>>
> I don't think they're special...? I wouldn't know where to setup that
> kind of setting, so I'm guessing it's the compositor.

A sanity check is `grep radeonsi /proc/$pid/maps` -- if something shows
up, the driver was loaded into the process.
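
If it helps to check several processes at once, the two /proc checks mentioned here (the environment variable in /proc/$pid/environ and the driver in /proc/$pid/maps) could be scripted roughly as follows. This is only a sketch: the helper names `parse_environ`, `maps_mention` and `inspect_pid` are made up for illustration, and it assumes a Linux /proc layout and permission to read the target process's files.

```python
from pathlib import Path


def parse_environ(raw: bytes) -> dict:
    """Split the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    env = {}
    for record in raw.split(b"\0"):
        if b"=" in record:
            key, _, value = record.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def maps_mention(maps_text: str, needle: str = "radeonsi") -> bool:
    """Return True if any line of /proc/<pid>/maps contains `needle`."""
    return any(needle in line for line in maps_text.splitlines())


def inspect_pid(pid: int) -> tuple:
    """Return (value of R600_DEBUG or None, whether radeonsi is mapped)."""
    proc = Path("/proc") / str(pid)
    env = parse_environ((proc / "environ").read_bytes())
    loaded = maps_mention((proc / "maps").read_text(errors="replace"))
    return env.get("R600_DEBUG"), loaded
```

Calling `inspect_pid` with the PID of kwin, dolphin or konsole would then show whether R600_DEBUG made it into that process and whether the process itself loaded the driver.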

>> 4) Something else I notice is that the page numbers of the VM faults
>> are of the form 0x001xxxxx. This suggest a 32-bit address underflow,
>> i.e. an address wraps around to a very large 32-bit number. Could you
>> please install a version of Mesa with assertions enabled
>> (--enable-debug in ./configure does the trick) and see if some check
>> is triggered?
>
> I'll do this next, it takes a while to build so I'll reply as soon as I
> have it :) It is a 64 bit system though, but I have both 64bit libs and
> 32bits libs installed (I can't think of anything that should be running
> that would be 32-bit...)

That doesn't really matter though. Even though the system is 64 bits and
the GPUVM has a 40 bit address space, the GPU still takes plenty of
address-related offsets as 32 bits or less.

Cheers,
Nicolai

>
> Thanks for help!
>
> - Mads
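
For reference, the numbers in the fault message can be unpacked with a few lines of Python. This is an illustrative sketch, not a diagnosis: it assumes the fault address is reported in units of 4 KiB GPU pages, and the offsets passed to `wrapped_offset` are hypothetical values chosen only to show what a 32-bit underflow looks like.

```python
GPU_PAGE_SHIFT = 12  # assumption: the fault address counts 4 KiB pages


def page_to_byte_address(page: int) -> int:
    """Convert a faulting page number into a byte address."""
    return page << GPU_PAGE_SHIFT


def decode_client_id(code: int) -> str:
    """Read a 32-bit value as four ASCII characters, e.g. 0x53444d30."""
    return code.to_bytes(4, "big").decode("ascii", errors="replace")


def wrapped_offset(offset: int, delta: int) -> int:
    """Compute offset - delta as an unsigned 32-bit value (wraps on underflow)."""
    return (offset - delta) & 0xFFFFFFFF


# Values taken from the dmesg output quoted in this thread.
fault_page = 1053960  # equals 0x00101508
print(hex(fault_page), hex(page_to_byte_address(fault_page)))
print(decode_client_id(0x53444D30))

# Hypothetical offsets: subtracting more than is there yields a value just
# below 4 GiB, so base + offset ends up roughly 4 GiB above the base.
print(hex(wrapped_offset(0x100, 0x200)))
```

Under the 4 KiB assumption, page 0x00101508 corresponds to a byte address slightly above the 4 GiB mark, which is the kind of result one would expect if a small base address had a wrapped 32-bit offset added to it.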