On Fri, Jul 18, 2014 at 5:47 PM, Christian König
<deathsimple@xxxxxxxxxxx> wrote:
> On 18.07.2014 05:07, Michel Dänzer wrote:
>>>>
>>>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>>>
>>> I'm still not very keen on this change since I still don't understand
>>> the reason why it's faster than with GTT. Definitely needs more testing
>>> on a wider range of systems.
>>
>> Sure. If anyone wants to give this patch a spin and see if they can
>> measure any performance difference, good or bad, that would be
>> interesting.
>>
>>> Maybe limit it to APUs for now?
>>
>> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an even
>> bigger win with dedicated GPUs than with the Kaveri built-in GPU on my
>> system. I suspect it may depend on the bandwidth available for PCIe vs.
>> system memory though.
>
> I've made a few tests today with the kernel part of the patches running
> Xonotic on Ultra in 1920 x 1080.
>
> Without any patches I get about 47.0 fps on average with my dedicated
> HD7870.
>
> Adding only "drm/radeon: Use write-combined CPU mappings of rings and
> IBs on >= SI" and that goes down to about 45.3 fps.
>
> Adding on top of that "drm/radeon: Use VRAM for indirect buffers on
> >= SI" and the frame rate goes down to about 27.74 fps.
>
> So enabling this unconditionally is definitely not a good idea. What I
> don't understand yet is why using USWC reduces the fps on SI as well. It
> looks like the reads from the IB for command stream validation on SI
> affect that more than expected.

Yes, there is a CS parser with SI, but shouldn't the parser read from
the CPU copy that came with the ioctl instead?

Anyway, I recommend only using VRAM for IBs which are not parsed and
patched by the CPU (which reduces it down to CIK graphics and DMA IBs,
right?).

Marek
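
To put the frame rates quoted above in perspective, here is a minimal sketch that converts them into relative changes against the unpatched baseline. The helper name percent_change and the labels are only illustrative; the three numbers are the averages Christian reported for Xonotic on his HD7870, and a single run on one system is assumed.

```python
def percent_change(baseline_fps: float, measured_fps: float) -> float:
    """Return the change of measured_fps relative to baseline_fps, in percent."""
    return (measured_fps - baseline_fps) / baseline_fps * 100.0


# Average frame rates quoted in the thread (Xonotic, Ultra, 1920 x 1080).
BASELINE_FPS = 47.0
RESULTS = [
    ("write-combined CPU mappings only", 45.3),
    ("write-combined mappings plus VRAM IBs", 27.74),
]

for label, fps in RESULTS:
    change = percent_change(BASELINE_FPS, fps)
    print(f"{label}: {fps:.2f} fps ({change:+.1f}% vs. baseline)")
```

On these numbers the write-combined mappings alone cost roughly 3.6%, while additionally placing IBs in VRAM costs roughly 41%, which is why the second patch looks like the one that should not be enabled unconditionally.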