Can we disable evictions for page tables, e.g. by removing them from the LRU list?

Marek

On Thu, May 29, 2014 at 6:30 PM, Christian König <deathsimple@xxxxxxxxxxx> wrote:
> Hi Marek & Alex,
>
> I've found the reason why forcefully evicting page tables sometimes crashes
> the box.
>
> Well, this is a typical hexdump of a page table before it is moved to GART:
> 000117f000 02914061 00000000
> 000117f008 02915061 00000000
> 000117f010 02916061 00000000
> 000117f018 02917061 00000000
> 000117f020 02918061 00000000
>
> And it looks like this when it comes back:
> 0006102000 00000000 00000000
> *
>
> Ideas? I don't really have an explanation for this. Moving buffers around
> otherwise seems to work perfectly fine.
>
> Thanks,
> Christian.
>
> On 28.05.2014 12:38, Christian König wrote:
>
>> I already tried a similar patch as well, without any more noticeable
>> crashes. But I'm going to give this another round with your patch and openarena.
>>
>> Thanks,
>> Christian.
>>
>> On 27.05.2014 23:55, Marek Olšák wrote:
>>>
>>> Hi Christian,
>>>
>>> I test on Bonaire (ChipID = 0x665c). Unfortunately, the hangs are not
>>> fixed yet. They are very rare and very random. Therefore, I have come
>>> up with a patch which evicts page tables between IBs. See the
>>> attachment. With that patch applied, the system starts fine, compiz
>>> and glxgears work, but once I start playing openarena, it locks up
>>> pretty quickly.
>>>
>>> The patch shouldn't do anything in theory, because pages are moved
>>> back to VRAM immediately after that. However, the VRAM address of page
>>> tables may end up being different from before, which might be the root
>>> cause.
>>>
>>> Marek
>>>
>>> On Wed, May 14, 2014 at 2:11 PM, Christian König
>>> <deathsimple@xxxxxxxxxxx> wrote:
>>>>
>>>> Crap, any chance you can narrow it down a bit more?
>>>>
>>>> I've just tried a piglit quick test on my Bonaire and it seems to work
>>>> perfectly fine.
>>>>
>>>> What hw do you test on?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> On 13.05.2014 23:21, Marek Olšák wrote:
>>>>
>>>>> Hi Christian,
>>>>>
>>>>> Even though some regressions are fixed by these patches:
>>>>>
>>>>> drm/radeon: fix page directory update size estimation
>>>>> drm/radeon: fix buffer placement under memory pressure v2
>>>>>
>>>>> and indeed, the texelFetch tests no longer hang, there is one more
>>>>> hang which needs to be fixed. :( All I know is the exact same commit
>>>>> causes it and it can only be reproduced by running the whole of piglit
>>>>> with concurrency enabled.
>>>>>
>>>>> My kernel git log:
>>>>>
>>>>> * 2ba22c8 - drm/radeon: fix buffer placement under memory pressure v2
>>>>> (10 hours ago) <Christian König>
>>>>> * 3af91e5 - drm/radeon: fix page directory update size estimation (21
>>>>> hours ago) <Christian König>
>>>>> * 6d2f294 - drm/radeon: use normal BOs for the page tables v4 (2
>>>>> months ago) <Christian König>
>>>>> * fa68834 - drm/radeon: further cleanup vm flushing & fencing (2
>>>>> months ago) <Christian König>
>>>>>
>>>>> fa68834 doesn't hang, but 2ba22c8 hangs, which means 6d2f294 or either
>>>>> of the two fixes is the first bad commit.
>>>>>
>>>>> Marek
>>>>>
>>>>> On Fri, May 9, 2014 at 8:03 PM, Marek Olšák <maraeo@xxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi Christian,
>>>>>>
>>>>>> This commit which first appeared in 3.15-rc1 causes hangs on Bonaire:
>>>>>>
>>>>>> commit 6d2f2944e95e504a7d33385eeeb9bb7fcca72592
>>>>>> Author: Christian König <christian.koenig@xxxxxxx>
>>>>>> Date: Thu Feb 20 13:42:17 2014 +0100
>>>>>>
>>>>>> drm/radeon: use normal BOs for the page tables v4
>>>>>>
>>>>>> No need to make it more complicated than necessary,
>>>>>> just allocate the page tables as normal BO and
>>>>>> flush whenever the address change.
>>>>>>
>>>>>> v2: update comments and function name
>>>>>> v3: squash bug fixes, page directory and tables patch
>>>>>> v4: rebased on Mareks changes
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
>>>>>>
>>>>>>
>>>>>> Reverting the commit gives me a lot of merge conflicts.
>>>>>>
>>>>>> The simplest way to reproduce the hangs is to run piglit with these
>>>>>> parameters:
>>>>>> -t texelFetch.fs
>>>>>>
>>>>>> Some of the tests allocate a lot of MSAA textures and the tests also
>>>>>> run in parallel, which creates a lot of memory pressure and probably
>>>>>> causes buffer evictions.
>>>>>>
>>>>>> Any idea what is wrong with it?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Marek
>>>>
>>>>
>>
>
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
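
For anyone who wants to compare the two page table hexdumps quoted in this thread, here is a minimal Python sketch. It is not taken from the radeon driver. It assumes each dump line is "offset low-dword high-dword", that the two dwords form one 64-bit entry, and that the low 12 bits of an entry hold flags while the rest is a 4 KiB-aligned address. The names parse_dump and split_entry are hypothetical helpers made up for this illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, flag bits live in the low 12 bits
FLAG_MASK = (1 << PAGE_SHIFT) - 1


def parse_dump(text):
    """Return (offset, 64-bit entry) pairs from 'offset lo hi' dump lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skips blank lines and the '*' repeat marker
        offset, lo, hi = (int(field, 16) for field in fields)
        entries.append((offset, (hi << 32) | lo))
    return entries


def split_entry(entry):
    """Split an entry into (address, flags) under the layout assumed above."""
    return entry & ~FLAG_MASK, entry & FLAG_MASK


# The two dumps as quoted in this thread.
BEFORE = """\
000117f000 02914061 00000000
000117f008 02915061 00000000
000117f010 02916061 00000000
000117f018 02917061 00000000
000117f020 02918061 00000000
"""

AFTER = """\
0006102000 00000000 00000000
*
"""

before = parse_dump(BEFORE)
after = parse_dump(AFTER)

if __name__ == "__main__":
    for offset, entry in before:
        addr, flags = split_entry(entry)
        print(f"{offset:#012x}: addr={addr:#x} flags={flags:#x}")
    print("entries all zero after the move:", all(e == 0 for _, e in after))
```

Under those assumptions, the entries before the move point at consecutive 4 KiB pages with identical flag bits, while every entry shown after the move is zero, which matches what the dumps suggest: the table contents are lost during the move rather than merely relocated.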