To improve X(Shm)PutImage performance with glamor, I implemented support for write-combined CPU mappings of BOs in GTT. This did provide a nice speedup, but to my surprise, using VRAM instead of write-combined GTT turned out to be even faster in general on my Kaveri machine, both for the internal GPU and for discrete GPUs. However, I've kept the changes that switch from GTT to VRAM in separate patches, in case this turns out to be a loss on other setups.

Kernel patches:

[PATCH 1/5] drm/radeon: Remove radeon_gart_restore()
[PATCH 2/5] drm/radeon: Pass GART page flags to
[PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in
[PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and
[PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI

Mesa patches:

[PATCH 1/5] winsys/radeon: Use separate caching buffer managers for
[PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some
[PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming
[PATCH 4/5] r600g,radeonsi: Use write-combined persistent GTT
[PATCH 5/5] r600g,radeonsi: Prefer VRAM for persistent mappings