On Thu, Aug 18, 2016 at 4:23 AM, Michel Dänzer <michel@xxxxxxxxxxx> wrote:
> Maybe the rasterization as two triangles results in bad PCIe bandwidth
> utilization. Using the asynchronous DMA engine for these transfers would
> probably be ideal, but having the 3D engine rasterize a single rectangle
> (either using the rectangle primitive or a large triangle with scissor)
> might already help.

There is only one thing that's bad for PCIe when the surface is linear:
the 3D engine. Disabling all but the first shader engine and all but the
first 2 RBs should improve performance for blits from VRAM to GTT. The
closed driver does that, but I don't remember if the destination must be
linear, must be in GTT, or both.

In any case, SDMA should still be the best for VRAM->GTT blits.

Marek
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel