Hi, Jérôme,

Thanks for commenting. I have a couple of questions / clarifications
below.

On Thu, 2019-03-21 at 09:46 -0400, Jerome Glisse wrote:
> On Thu, Mar 21, 2019 at 01:22:22PM +0000, Thomas Hellstrom wrote:
> > Resending since last series was sent through a mis-configured SMTP
> > server.
> >
> > Hi,
> > This is an early RFC to make sure I don't go too far in the wrong
> > direction.
> >
> > Non-coherent GPUs that can't directly see contents in CPU-visible
> > memory, like VMWare's SVGA device, run into trouble when trying to
> > implement coherent memory requirements of modern graphics APIs.
> > Examples are Vulkan and OpenGL 4.4's ARB_buffer_storage.
> >
> > To remedy, we need to emulate coherent memory. Typically when it's
> > detected that a buffer object is about to be accessed by the GPU,
> > we need to gather the ranges that have been dirtied by the CPU
> > since the last operation, apply an operation to make the content
> > visible to the GPU and clear the
> > the dirty tracking.
> >
> > Depending on the size of the buffer object and the access pattern
> > there are two major possibilities:
> >
> > 1) Use page_mkwrite() and pfn_mkwrite(). (GPU buffer objects are
> > backed either by PCI device memory or by driver-alloced pages).
> > The dirty-tracking needs to be reset by write-protecting the
> > affected ptes and flush tlb. This has a complexity of
> > O(num_dirty_pages), but the write page-fault is of course costly.
> >
> > 2) Use hardware dirty-flags in the ptes. The dirty-tracking needs
> > to be reset by clearing the dirty bits and flush tlb. This has a
> > complexity of O(num_buffer_object_pages) and dirty bits need to be
> > scanned in full before each gpu-access.
> >
> > So in practice the two methods need to be interleaved for best
> > performance.
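
For reference, the cost difference between 1) and 2) above can be
illustrated with a small user-space toy model. This is only a sketch,
not the proposed kernel code: the toy_* names, the fixed page count and
the bool "pte" bits are all assumptions made up for illustration, and
tlb flushing is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_PAGES 16	/* hypothetical buffer object size, in pages */

/* Toy stand-in for a pte: only the two bits discussed above. */
struct toy_pte {
	bool writable;
	bool dirty;
};

struct toy_bo {
	struct toy_pte pte[TOY_NUM_PAGES];
	size_t pending[TOY_NUM_PAGES];	/* pages recorded by write "faults" */
	size_t num_pending;
};

/* wrprotect == true selects method 1, false selects method 2. */
static void toy_bo_init(struct toy_bo *bo, bool wrprotect)
{
	for (size_t i = 0; i < TOY_NUM_PAGES; i++) {
		bo->pte[i].writable = !wrprotect;
		bo->pte[i].dirty = false;
	}
	bo->num_pending = 0;
}

/* CPU write: the first write to a write-protected page takes a "fault". */
static void toy_cpu_write(struct toy_bo *bo, size_t page)
{
	assert(page < TOY_NUM_PAGES);
	if (!bo->pte[page].writable) {
		/* Models the mkwrite callback: record the page, unprotect. */
		bo->pending[bo->num_pending++] = page;
		bo->pte[page].writable = true;
	}
	bo->pte[page].dirty = true;
}

/* Method 1: re-protect only the recorded pages. Returns ptes touched. */
static size_t toy_sync_wrprotect(struct toy_bo *bo, size_t *num_dirty)
{
	size_t touched = 0;

	for (size_t i = 0; i < bo->num_pending; i++) {
		struct toy_pte *pte = &bo->pte[bo->pending[i]];

		pte->writable = false;
		pte->dirty = false;
		touched++;
	}
	*num_dirty = bo->num_pending;
	bo->num_pending = 0;
	return touched;
}

/* Method 2: scan every pte for the dirty bit. Returns ptes visited. */
static size_t toy_sync_clean(struct toy_bo *bo, size_t *num_dirty)
{
	size_t visited = 0;

	*num_dirty = 0;
	for (size_t i = 0; i < TOY_NUM_PAGES; i++, visited++) {
		if (bo->pte[i].dirty) {
			bo->pte[i].dirty = false;
			(*num_dirty)++;
		}
	}
	return visited;
}
```

In this model toy_sync_wrprotect() touches only the pages that faulted,
whereas toy_sync_clean() always visits every page of the buffer object,
which is the O(num_dirty_pages) versus O(num_buffer_object_pages)
difference described above.
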
> >
> > So to facilitate this, I propose two new helpers,
> > apply_as_wrprotect() and apply_as_clean() ("as" stands for
> > address-space) both inspired by unmap_mapping_range(). Users of
> > these helpers are in the making, but needs some cleaning-up.
>
> To be clear this should _only be use_ for mmap of device file ? If so
> the API should try to enforce that as much as possible for instance
> by mandating the file as argument so that the function can check it
> is only use in that case. Also big scary comment to make sure no one
> just start using those outside this very limited frame.

Fine with me. Perhaps we could BUG() / WARN() on certain VMA flags
instead of mandating the file as argument. That can make sure we don't
accidentally hit pages we shouldn't hit.

>
> > There's also a change to x_mkwrite() to allow dropping the mmap_sem
> > while waiting.
>
> This will most likely conflict with userfaultfd write protection.

Are you referring to the x_mkwrite() usage itself or the mmap_sem
dropping facilitation?

> Maybe
> building your thing on top of that would be better.
>
> ...
>
> I will take a cursory look at the patches.
> Some more questions / clarifications on those as well.
> Cheers,
> Jérôme
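
P.S. A rough user-space sketch of what a check on VMA flags, as
suggested above, could look like. Everything here is hypothetical: the
TOY_VM_* bits are placeholders and not the kernel's values, the choice
of which flags to require (shared, plus PFN or mixed map) is only an
assumption for illustration, and in the kernel the message would be a
WARN rather than an fprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Placeholder bit values for this sketch; the kernel defines its own. */
#define TOY_VM_SHARED	(1UL << 0)
#define TOY_VM_PFNMAP	(1UL << 1)
#define TOY_VM_MIXEDMAP	(1UL << 2)

/* Toy stand-in for a vma: only the flags word matters here. */
struct toy_vma {
	unsigned long vm_flags;
};

/*
 * Return true if the vma looks like a shared device mapping that the
 * helpers are meant for. Otherwise warn and tell the caller to skip it,
 * so that pages of unrelated mappings are never touched.
 */
static bool toy_vma_ok_for_dirty_tracking(const struct toy_vma *vma)
{
	bool shared = vma->vm_flags & TOY_VM_SHARED;
	bool device = vma->vm_flags & (TOY_VM_PFNMAP | TOY_VM_MIXEDMAP);

	if (!(shared && device)) {
		fprintf(stderr, "WARN: unexpected vm_flags 0x%lx, skipping\n",
			vma->vm_flags);
		return false;
	}
	return true;
}
```

A helper walking the address space would call this once per vma and
continue with the next one when it returns false.
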