Hi Mads,

setting R600_DEBUG=nodma in the X server should work around your problem
for now.

Marek, perhaps an out-of-bounds check for tiled texture memory access
similar to the linear access check is necessary? I wonder if you've seen
something about that in the docs.

I've annotated the sDMA IB dump. It's a linear-to-display-tiled copy on
Carrizo. I tried to reproduce with the attached patch, but failed to do
so even with amdgpu.vm_debug=1. With the patch, I get DMA copies that are
identical to the one that causes the VM fault except for a different
bank_height and macro_tile_aspect, so the issue is likely related to
those.

Nicolai

On 21.06.2016 19:32, Nicolai Hähnle wrote:
> On 21.06.2016 19:16, Mads wrote:
>> I sent this for 1.5 hours ago, but since it hasn't arrived to the
>> mailing list yet, I try again...
>
> It arrived, no worries :)
>
> I'll take a look later.
>
> Nicolai
>
>>
>> On 2016-06-21 17:48, Mads wrote:
>>
>>> On 2016-06-21 10:12, Mads wrote:
>>>
>>> On 2016-06-21 09:39, Nicolai Hähnle wrote:
>>>
>>> Thanks. However, I still don't think this is going to help. Your
>>> earlier trace experiments showed that the problematic SDMA commands
>>> came from the X server, _not_ from plasmashell.
>>>
>>> So what we see here is likely just the first set of GPU commands sent
>>> by plasmashell after the VM fault occurred. Since the plasmashell
>>> process is unable to tell who caused the VM fault, it takes the blame
>>> incorrectly. Are you sure the X server is using your self-compiled
>>> radeonsi_dri.so and has the environment variable set? If it creates a
>>> ddebug_dump, it might be somewhere else (it's based off the HOME
>>> environment variable, which may be different).
>>> I'll take a second look to see if there's an X dump there too, but
>>> unfortunately it'll be in about ~8 hours before I have the machine at
>>> hand again..
>>>
>>> And yes, I'm sure, everything is built through portage, so there is no
>>> "self-compiled" on the system per se.
>>> There's always just one lib
>>> available at any time :)
>>
>> You were right! X didn't have R600_DEBUG=check_vm in environment (no
>> login shell/sourcing of /etc/profile).
>>
>> Here's what i ran:
>>
>>> $ XAUTHORITY=.Xauthority DISPLAY=:0 LIBGL_DEBUG=verbose dolphin
>>> libGL: pci id for fd 9: 1002:9874, driver radeonsi
>>> libGL: OpenDriver: trying /usr/lib64/dri/tls/radeonsi_dri.so
>>> libGL: OpenDriver: trying /usr/lib64/dri/radeonsi_dri.so
>>> si_vm_fault_occured: failed to parse line ' Either
>>> enable ECC checking or force module loading by setting
>>> 'ecc_enable_override'.
>>> '
>>> libGL: Using DRI3 for screen 0
>>> Trying to convert empty KLocalizedString to QString.
>>> Cannot creat accessible child interface for object:
>>> PlacesView(0x118d670) index: 5
>>> QPixmap::scaled: Pixmap is a null pixmap
>>> QPixmap::scaled: Pixmap is a null pixmap
>>> (... etc ...)
>>> The X11 connection broke (error 1). Did the X11 server die?
>>
>> Attaching dmesg and ddebug_dump.
>>
>> - Mads
-------------- next part --------------
VM fault report.

Driver vendor: X.Org
Device vendor: AMD
Device name: AMD CARRIZO (DRM 3.1.0 / 4.6.2-gentoo, LLVM 3.9.0)
Failing VM page: 0x00101508

Buffer list (in units of pages = 4kB):

      Size   VM start page      VM end page        Usage
         8   0x0000000100035    0x000000010003d    IB1
       843   -- hole --
       975   0x0000000100388    0x0000000100757    SDMA_BUFFER
      2473   -- hole --
      1032   0x0000000101100    0x0000000101508    SDMA_BUFFER

Note: The holes represent memory not used by the IB.
      Other buffers can still be allocated there.
------------------ sDMA IB begin ------------------
00000501    COPY, TILED_SUB_WINDOW
01100000    tiled_address_lo
00000001    tiled_address_hi
001d0000    tiled_x = 0, tiled_y = 29
00ab0000    tiled_z = 0, pitch_tile_max = 0xab = 171
0000407f    slice_tile_max = 0x407f = 16511
02481822
00388000    linear_address_lo
00000001    linear_address_hi
00000000    linear_x = 0, linear_y = 0
057f0000    linear_z = 0, linear_pitch = 0x580 = 1408
000f3b7f    linear_slice_pitch = 0xf3b80 = 998272
02c40555    copy_width_aligned = 0x556 = 1366, copy_height = 709
00000000    copy_depth = 1
00000000    NOP
------------------- sDMA IB end -------------------

linear_height = 709

log(bpe) = 2, bpe = 4
array_mode = 4 (ARRAY_2D_TILED_THIN1)
micro_tile_mode = 0 (DISPLAY_MICRO_TILING)
log(tile_split) = 3
bank_width = 0
bank_height = 2
num_banks = 2
macro_tile_aspect = 2
pipe_config = 0

tiled_pitch = 172 * 8 = 1376
tiled_slice_pitch = 16512 * 64 = 1056768
-> tiled_height = 768

My Carrizo:
tile bits 01401822
bank_height = 0
num_banks = 2
macro_tile_aspect = 1

SDMA Dump Done.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reproduction-attempt.patch
Type: text/x-patch
Size: 3783 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20160622/c49aae43/attachment.bin>
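
For anyone who wants to re-check the annotations in the dump, here is a rough Python sketch. It decodes the two tile-bits dwords (02481822 from the faulting copy, 01401822 from "My Carrizo") and recomputes the extent of the tiled surface in 4kB pages. The bit positions in TILE_INFO_FIELDS are an assumption: they are inferred from the field values listed in the dump, not quoted from hardware documentation. The helper names are made up for illustration.

```python
PAGE_SIZE = 4096  # the buffer list is in units of 4kB pages

# ASSUMPTION: (name, shift, width) inferred from the annotated values in the
# dump; this is not taken from any register specification.
TILE_INFO_FIELDS = [
    ("log_bpe", 0, 3),
    ("array_mode", 3, 4),
    ("micro_tile_mode", 8, 3),
    ("log_tile_split", 11, 3),
    ("bank_width", 15, 2),
    ("bank_height", 18, 2),
    ("num_banks", 21, 2),
    ("macro_tile_aspect", 24, 2),
    ("pipe_config", 26, 5),
]


def decode_tile_info(dword):
    """Split a tile-info dword into named bit fields (hypothetical helper)."""
    return {
        name: (dword >> shift) & ((1 << width) - 1)
        for name, shift, width in TILE_INFO_FIELDS
    }


def tiled_surface_extent(pitch_tile_max, slice_tile_max, bpe):
    """Return (pitch in pixels, height in rows, size in pages).

    Uses the same arithmetic as the dump annotations:
    pitch = (pitch_tile_max + 1) * 8, slice = (slice_tile_max + 1) * 64.
    """
    pitch = (pitch_tile_max + 1) * 8
    slice_pixels = (slice_tile_max + 1) * 64
    size_bytes = slice_pixels * bpe
    pages = -(-size_bytes // PAGE_SIZE)  # round up to whole pages
    return pitch, slice_pixels // pitch, pages


faulting = decode_tile_info(0x02481822)
working = decode_tile_info(0x01401822)

# Fields that differ between the faulting copy and the reproduction attempt.
differing = sorted(k for k in faulting if faulting[k] != working[k])
print("differing fields:", differing)

bpe = 1 << faulting["log_bpe"]
pitch, height, pages = tiled_surface_extent(0xAB, 0x407F, bpe)
print("tiled pitch, height, pages:", pitch, height, pages)

# tiled_address_hi:tiled_address_lo from the IB, converted to a page number.
start_page = ((0x00000001 << 32) | 0x01100000) // PAGE_SIZE
end_page = start_page + pages
print("tiled buffer pages: 0x%x .. 0x%x" % (start_page, end_page))
```

Under these assumptions the two dwords differ only in bank_height and macro_tile_aspect, and the tiled surface covers 1032 pages starting at 0x101100. Its first page past the end, 0x101508, is the page reported as "Failing VM page", which would fit an access running just beyond the tiled buffer.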