Grigori, you can git-checkout the commits before and after the memory
management changes, compile both and test them.

Marek

On Fri, May 30, 2014 at 2:30 AM, Grigori Goronzy <greg@xxxxxxxxxxxx> wrote:
> On 13.05.2014 22:27, Marek Olšák wrote:
>>
>> I applied these two patches Christian sent to dri-devel:
>>
>> drm/radeon: fix page directory update size estimation
>> drm/radeon: fix buffer placement under memory pressure v2
>>
>> on top of torvalds's master branch.
>>
>
> With latest kernel master (a991639c) I still see a regression compared to
> 3.13 or 3.14, which have similar performance. Xonotic is about 7% slower.
> OpenArena and Unigine Tropics are also noticeably slower, but I didn't
> record accurate numbers.
>
> Maybe the improved memory management has some overhead, but this is not
> acceptable IMHO. I'll try to investigate further.
>
> Best regards
>
> Grigori
>
>> Marek
>>
>> On Tue, May 13, 2014 at 10:19 PM, Grigori Goronzy <greg@xxxxxxxxxxxx> wrote:
>>>
>>> On 13.05.2014 21:50, Marek Olšák wrote:
>>>>
>>>> Hi Christian,
>>>>
>>>> The performance regression I saw with piglit seems to be fixed with
>>>> latest kernel git. It's difficult to bisect the kernel, because there
>>>> are only merges between 3.14 and 3.15 and the merged commits are
>>>> actually based on 3.14-rc1 and 3.14-rc4.
>>>>
>>>> All seems to be fine with your fixes.
>>>>
>>>
>>> Which fixes have you applied? There are quite a few pending patches on
>>> dri-devel that aren't yet part of drm-fixes-3.15.
>>>
>>> Grigori
>>>
>>>> Marek
>>>>
>>>> On Tue, May 13, 2014 at 5:31 PM, Christian König <deathsimple@xxxxxxxxxxx> wrote:
>>>>>
>>>>> Is the performance regression caused by the page table changes or
>>>>> something else?
>>>>>
>>>>> I ran some tests with Xonotic while developing it and they didn't
>>>>> show anything obvious, but I didn't test on different systems.
>>>>>
>>>>> Christian.
>>>>>
>>>>> On 13.05.2014 17:19, Marek Olšák wrote:
>>>>>
>>>>>> Your latest patches fix the regression.
>>>>>>
>>>>>> The performance regression can also be reproduced with piglit
>>>>>> "-t texelFetch.fs".
>>>>>>
>>>>>> Kernel 3.14:
>>>>>> real 0m17.724s
>>>>>> user 0m41.905s
>>>>>> sys 0m11.299s
>>>>>>
>>>>>> The problematic commit checked out + your fixes (without the PTE
>>>>>> patch, I think):
>>>>>> real 0m23.474s
>>>>>> user 1m1.008s
>>>>>> sys 0m13.812s
>>>>>>
>>>>>> Marek
>>>>>>
>>>>>> On Tue, May 13, 2014 at 3:57 PM, Christian König <deathsimple@xxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On 13.05.2014 15:22, Alex Deucher wrote:
>>>>>>>
>>>>>>>> On Mon, May 12, 2014 at 7:38 PM, Grigori Goronzy <greg@xxxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> I can confirm this fixes it for me, too.
>>>>>>>>>
>>>>>>>>> 3.15 with these fixes and the large PTE patches actually ends up
>>>>>>>>> being noticeably slower than earlier kernels with Xonotic, though.
>>>>>>>>> I wonder what's going on.
>>>>>>>>
>>>>>>>> Allocation overhead?
>>>>>>>
>>>>>>> Unlikely, Xonotic just allocates a single page table at start, which
>>>>>>> then gets extended to a certain rate until they no longer need more
>>>>>>> address space and are done with it.
>>>>>>>
>>>>>>> Grigori, can you bisect and/or try to figure out what's wrong here?
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>>
>>>>>>>>> Grigori
>>>>>>>>>
>>>>>>>>> On 12.05.2014 14:50, Christian König wrote:
>>>>>>>>>>
>>>>>>>>>> I could reproduce the problem with Xonotic and I think I've found
>>>>>>>>>> the issue.
>>>>>>>>>>
>>>>>>>>>> Please test the attached patch.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>> On 11.05.2014 11:06, Christian König wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I have tested it and it doesn't fix the hangs.
>>>>>>>>>>>
>>>>>>>>>>> Yeah, thought so. Well, it was just a guess.
>>>>>>>>>>>
>>>>>>>>>>>> (Also, I don't like the patch, because it reverts the behavior I
>>>>>>>>>>>> added for userspace buffers.)
>>>>>>>>>>>
>>>>>>>>>>> Actually it shouldn't affect that. The alternative domain always
>>>>>>>>>>> contains GART even when userspace only specified VRAM as placement
>>>>>>>>>>> (as long as it is technically possible to do so).
>>>>>>>>>>>
>>>>>>>>>>> So what should happen is that TTM sees the current placement,
>>>>>>>>>>> matches that with the desired placement and should find that it
>>>>>>>>>>> doesn't need to move the buffer (we should just test if this
>>>>>>>>>>> behavior really works as expected).
>>>>>>>>>>>
>>>>>>>>>>> Christian.
>>>>>>>>>>>
>>>>>>>>>>> On 10.05.2014 23:38, Marek Olšák wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Christian,
>>>>>>>>>>>>
>>>>>>>>>>>> I have tested it and it doesn't fix the hangs.
>>>>>>>>>>>>
>>>>>>>>>>>> (Also, I don't like the patch, because it reverts the behavior I
>>>>>>>>>>>> added for userspace buffers.)
>>>>>>>>>>>>
>>>>>>>>>>>> Marek
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, May 10, 2014 at 6:34 PM, Christian König <deathsimple@xxxxxxxxxxx> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Couldn't reproduce the issue so far. So the attached patch is
>>>>>>>>>>>>> just a complete shot in the dark found by rereading the code,
>>>>>>>>>>>>> but it might actually be the problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please give it a try.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Going to keep testing in the meantime,
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10.05.2014 10:23, Christian König wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I see hangs with kernel 3.15 and SI under memory pressure,
>>>>>>>>>>>>>>> e.g. if I boot with radeon.vramlimit=256 and then run
>>>>>>>>>>>>>>> Xonotic timedemo with high settings.
>>>>>>>>>>>>>>> I haven't had a chance to bisect it yet, but it might be a
>>>>>>>>>>>>>>> similar problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds like the same issue to me. Thx for the good test case.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any idea what is wrong with it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually, I was already wondering why it went so smoothly
>>>>>>>>>>>>>> without any regressions so far; I hadn't noticed the bug in
>>>>>>>>>>>>>> bugzilla.kernel.org yet.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some of the tests allocate a lot of MSAA textures and the
>>>>>>>>>>>>>>> tests also run in parallel, which creates a lot of memory
>>>>>>>>>>>>>>> pressure and probably causes buffer evictions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds like the underlying problem to me. We probably evict
>>>>>>>>>>>>>> some part of a page table without updating the page directory.
>>>>>>>>>>>>>> Going to dig into it today; it's probably just a one-liner
>>>>>>>>>>>>>> missing somewhere in the VM code.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 09.05.2014 23:39, Grigori Goronzy wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 09.05.2014 20:03, Marek Olšák wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This commit, which first appeared in 3.15-rc1, causes hangs
>>>>>>>>>>>>>>>> on Bonaire:
>>>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The simplest way to reproduce the hangs is to run piglit
>>>>>>>>>>>>>>>> with these parameters:
>>>>>>>>>>>>>>>> -t texelFetch.fs
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Some of the tests allocate a lot of MSAA textures and the
>>>>>>>>>>>>>>>> tests also run in parallel, which creates a lot of memory
>>>>>>>>>>>>>>>> pressure and probably causes buffer evictions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I see hangs with kernel 3.15 and SI under memory pressure,
>>>>>>>>>>>>>>> e.g. if I boot with radeon.vramlimit=256 and then run
>>>>>>>>>>>>>>> Xonotic timedemo with high settings.
>>>>>>>>>>>>>>> I haven't had a chance to bisect it yet, but it might be a
>>>>>>>>>>>>>>> similar problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Grigori
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> dri-devel mailing list
>>>>>>>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
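For reference, the two sets of piglit timings quoted in this thread (kernel 3.14 versus the problematic commit with the fixes applied) can be turned into a relative slowdown. The following is only a minimal sketch, not something posted by the participants: the helper names `parse_time` and `slowdown_percent` are hypothetical, and it assumes the values use the `<minutes>m<seconds>s` format printed by the shell's `time` keyword.

```python
import re


def parse_time(value: str) -> float:
    """Convert a value such as '1m1.008s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def slowdown_percent(baseline: str, candidate: str) -> float:
    """Relative increase of candidate over baseline, in percent."""
    base = parse_time(baseline)
    return (parse_time(candidate) - base) / base * 100.0


# (kernel 3.14, problematic commit + fixes), copied from the thread.
timings = {
    "real": ("0m17.724s", "0m23.474s"),
    "user": ("0m41.905s", "1m1.008s"),
    "sys": ("0m11.299s", "0m13.812s"),
}

for label, (old, new) in timings.items():
    print(f"{label}: {slowdown_percent(old, new):+.1f}%")
```

On these numbers the wall-clock ("real") time grows by roughly 32%. This is a single run per kernel, so it should be read as an indication rather than a measurement.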