Hi Nicolai,

If we don't already have an option for this, try doubling the size of the
VM area allocated for each BO in userspace.

That should give you a nice hole between each BO and so should help to
catch cases where somebody writes past the end of a BO.

Regards,
Christian.

On 22.06.2016 at 09:50, Nicolai Hähnle wrote:
> Hi Mads,
>
> setting R600_DEBUG=nodma in the X server should work around your
> problem for now.
>
> Marek, perhaps an out-of-bounds check for tiled texture memory access
> similar to the linear access check is necessary? I wonder if you've
> seen something about that in the docs.
>
> I've annotated the sDMA IB dump. It's a linear-to-display-tiled copy
> on Carrizo. I tried to reproduce with the attached patch, but failed
> to do so even with amdgpu.vm_debug=1. With the patch, I get DMA copies
> that are identical to the one that causes the VM fault except for a
> different bank_height and macro_tile_aspect, so the issue is likely
> related to those.
>
> Nicolai
>
> On 21.06.2016 19:32, Nicolai Hähnle wrote:
>> On 21.06.2016 19:16, Mads wrote:
>>> I sent this 1.5 hours ago, but since it hasn't arrived on the
>>> mailing list yet, I'll try again...
>>
>> It arrived, no worries :)
>>
>> I'll take a look later.
>>
>> Nicolai
>>
>>>
>>> On 2016-06-21 17:48, Mads wrote:
>>>
>>>> On 2016-06-21 10:12, Mads wrote:
>>>>
>>>> On 2016-06-21 09:39, Nicolai Hähnle wrote:
>>>>
>>>> Thanks. However, I still don't think this is going to help. Your
>>>> earlier trace experiments showed that the problematic SDMA commands
>>>> came from the X server, _not_ from plasmashell.
>>>>
>>>> So what we see here is likely just the first set of GPU commands sent
>>>> by plasmashell after the VM fault occurred. Since the plasmashell
>>>> process is unable to tell who caused the VM fault, it takes the blame
>>>> incorrectly. Are you sure the X server is using your self-compiled
>>>> radeonsi_dri.so and has the environment variable set?
>>>> If it creates a
>>>> ddebug_dump, it might be somewhere else (it's based off the HOME
>>>> environment variable, which may be different).
>>>> I'll take a second look to see if there's an X dump there too, but
>>>> unfortunately it'll be in about ~8 hours before I have the machine at
>>>> hand again..
>>>>
>>>> And yes, I'm sure, everything is built through portage, so there is no
>>>> "self-compiled" on the system per se. There's always just one lib
>>>> available at any time :)
>>>
>>> You were right! X didn't have R600_DEBUG=check_vm in environment (no
>>> login shell/sourcing of /etc/profile).
>>>
>>> Here's what i ran:
>>>
>>>> $ XAUTHORITY=.Xauthority DISPLAY=:0 LIBGL_DEBUG=verbose dolphin
>>>> libGL: pci id for fd 9: 1002:9874, driver radeonsi
>>>> libGL: OpenDriver: trying /usr/lib64/dri/tls/radeonsi_dri.so
>>>> libGL: OpenDriver: trying /usr/lib64/dri/radeonsi_dri.so
>>>> si_vm_fault_occured: failed to parse line ' Either
>>>> enable ECC checking or force module loading by setting
>>>> 'ecc_enable_override'.
>>>> '
>>>> libGL: Using DRI3 for screen 0
>>>> Trying to convert empty KLocalizedString to QString.
>>>> Cannot creat accessible child interface for object:
>>>> PlacesView(0x118d670) index: 5
>>>> QPixmap::scaled: Pixmap is a null pixmap
>>>> QPixmap::scaled: Pixmap is a null pixmap
>>>> (... etc ...)
>>>> The X11 connection broke (error 1). Did the X11 server die?
>>>
>>> Attaching dmesg and ddebug_dump.
>>>
>>> - Mads
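
The suggestion at the top of this message (reserving twice the VM area for
each BO so that a hole follows it) can be sketched roughly as below. This is
only an illustration under stated assumptions: it models a simple bump
allocator for GPU virtual addresses, and the names va_allocator, align_up and
va_alloc_with_hole are hypothetical, not taken from Mesa, libdrm or the
kernel driver. Only the first half of each reservation is meant to be backed
by the BO; the second half stays unmapped, so a write past the end of the BO
should hit the hole and raise a VM fault instead of silently landing in the
next BO.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bump allocator for GPU virtual address space. */
struct va_allocator {
    uint64_t next;      /* next unreserved virtual address */
    uint64_t alignment; /* power-of-two granularity, e.g. the page size */
};

/* Round v up to the next multiple of a (a must be a power of two). */
static uint64_t align_up(uint64_t v, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/*
 * Reserve twice the aligned BO size and return the base address.
 * The caller is assumed to map the BO only at [base, base + size);
 * [base + size, base + 2 * size) is left unmapped as a guard hole.
 */
static uint64_t va_alloc_with_hole(struct va_allocator *va, uint64_t bo_size)
{
    uint64_t size = align_up(bo_size, va->alignment);
    uint64_t base = align_up(va->next, va->alignment);

    va->next = base + 2 * size;
    return base;
}
```

With a 4096-byte alignment and a starting address of 0x100000, a 10000-byte
BO is rounded up to 0x3000 bytes and placed at 0x100000, and the next BO
starts at 0x106000, which leaves a 0x3000-byte hole between the two. The
cost is that virtual address space consumption doubles, which is presumably
acceptable for a debug option.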