On 17.07.2014 19:09, Christian König wrote:
> On 17.07.2014 12:01, Michel Dänzer wrote:
>> In order to try and improve X(Shm)PutImage performance with glamor, I
>> implemented support for write-combined CPU mappings of BOs in GTT.
>>
>> This did provide a nice speedup, but to my surprise, using VRAM instead
>> of write-combined GTT turned out to be even faster in general on my
>> Kaveri machine, both for the internal GPU and for discrete GPUs.
>>
>> However, I've kept the changes from GTT to VRAM separated, in case this
>> turns out to be a loss on other setups.
>>
>> Kernel patches:
>>
>> [PATCH 1/5] drm/radeon: Remove radeon_gart_restore()
>> [PATCH 2/5] drm/radeon: Pass GART page flags to
>> [PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in
>> [PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and
>
> Those four are Reviewed-by: Christian König <christian.koenig@xxxxxxx>

Thanks!

>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>
> I'm still not very keen on this change since I still don't understand
> the reason why it's faster than with GTT. Definitely needs more testing
> on a wider range of systems.

Sure. If anyone wants to give this patch a spin and see if they can
measure any performance difference, good or bad, that would be
interesting.

> Maybe limit it to APUs for now?

But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an even
bigger win with dedicated GPUs than with the Kaveri built-in GPU on my
system. I suspect it may depend on the bandwidth available for PCIe vs.
system memory though.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel