The purpose of this series is to remove the often full-screen vmwgfx sequence vmap() blit() vunmap() and replace it with kmap_atomic()-style per-page maps. Although somewhat rare nowadays, 32-bit VMs restrict the vmap space, so huge vmaps may sometimes fail. Large vmaps also lead to frequent global TLB flushes.

On the other hand, using kmap_atomic() makes the blit code a lot more complex, but hopefully we shouldn't need to revisit that code. On 64-bit architectures, kmap_atomic() is essentially free as long as the mapping is cached, since the linear kernel map is then reused. On 32-bit architectures the same holds for lowmem pages, and for highmem pages the TLB flushes are local and per-page. Preliminary findings on 32-bit VMs show that while the blit itself consumes more CPU with the new approach, the system idle time is about the same or slightly increased.

The CPU blit was originally written as a TTM utility, for use also with CPU buffer object moves, but until that code is written and tested we're keeping it in vmwgfx.

In addition, we've added the possibility to compute the diff bounding box between the source and the destination. The overhead of computing that bounding box is small when it is integrated with a CPU blit that has to be performed anyway. It comes in very handy for remoting of 2D-only VMs where the damage rects are broken (Ubuntu / compiz) or for page-flipping (gnome-shell).

Note that a slightly modified version of patch 3/5 is already upstream in 4.15. This series is against drm-next, and that patch is included here for completeness.
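To illustrate why the diff bounding box is cheap when folded into a blit, here is a minimal userspace sketch. It is not the code from this series: the names blit_with_diff and struct diff_rect are hypothetical, it works on plain byte buffers rather than per-page kmap_atomic() mappings, and it assumes one byte per pixel for simplicity. Each line is scanned from both ends for the first and last differing byte, only that span is copied, and the rectangle is grown to cover it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: bounding box of all bytes that changed.
 * x0/y0 are inclusive, x1/y1 exclusive. Empty when x1 <= x0. */
struct diff_rect {
    size_t x0, y0;
    size_t x1, y1;
};

/* Copy a width x height byte image from src to dst, line by line,
 * and record the bounding box of the bytes that actually differed. */
static void blit_with_diff(uint8_t *dst, size_t dst_stride,
                           const uint8_t *src, size_t src_stride,
                           size_t width, size_t height,
                           struct diff_rect *diff)
{
    assert(width <= dst_stride && width <= src_stride);

    /* Start with an empty (inverted) rectangle. */
    diff->x0 = width;
    diff->y0 = height;
    diff->x1 = 0;
    diff->y1 = 0;

    for (size_t y = 0; y < height; y++) {
        uint8_t *d = dst + y * dst_stride;
        const uint8_t *s = src + y * src_stride;
        size_t first = 0, last = width;

        /* Skip the identical prefix of this line. */
        while (first < width && d[first] == s[first])
            first++;
        if (first == width)
            continue; /* Line unchanged: nothing to copy. */

        /* Skip the identical suffix; d[first] differs, so last > first. */
        while (d[last - 1] == s[last - 1])
            last--;

        /* Copy only the span that may have changed. */
        memcpy(d + first, s + first, last - first);

        /* Grow the bounding box to include [first, last) on line y. */
        if (first < diff->x0)
            diff->x0 = first;
        if (last > diff->x1)
            diff->x1 = last;
        if (y < diff->y0)
            diff->y0 = y;
        diff->y1 = y + 1;
    }
}
```

A caller would treat the result as empty when x1 <= x0 and otherwise forward the rectangle as the damage region. The extra cost over a plain copy is the comparison pass over data the blit reads anyway.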