On Thu, Mar 21, 2019 at 07:51:16PM +0000, Thomas Hellstrom wrote:
> Hi, Jérôme,
>
> Thanks for commenting. I have a couple of questions / clarifications
> below.
>
> On Thu, 2019-03-21 at 09:46 -0400, Jerome Glisse wrote:
> > On Thu, Mar 21, 2019 at 01:22:22PM +0000, Thomas Hellstrom wrote:
> > > Resending since last series was sent through a mis-configured SMTP
> > > server.
> > >
> > > Hi,
> > > This is an early RFC to make sure I don't go too far in the wrong
> > > direction.
> > >
> > > Non-coherent GPUs that can't directly see contents in CPU-visible
> > > memory,
> > > like VMWare's SVGA device, run into trouble when trying to
> > > implement
> > > coherent memory requirements of modern graphics APIs. Examples are
> > > Vulkan and OpenGL 4.4's ARB_buffer_storage.
> > >
> > > To remedy, we need to emulate coherent memory. Typically when it's
> > > detected
> > > that a buffer object is about to be accessed by the GPU, we need to
> > > gather the ranges that have been dirtied by the CPU since the last
> > > operation,
> > > apply an operation to make the content visible to the GPU and clear
> > > the
> > > the dirty tracking.
> > >
> > > Depending on the size of the buffer object and the access pattern
> > > there are
> > > two major possibilities:
> > >
> > > 1) Use page_mkwrite() and pfn_mkwrite(). (GPU buffer objects are
> > > backed
> > > either by PCI device memory or by driver-alloced pages).
> > > The dirty-tracking needs to be reset by write-protecting the
> > > affected ptes
> > > and flush tlb. This has a complexity of O(num_dirty_pages), but the
> > > write page-fault is of course costly.
> > >
> > > 2) Use hardware dirty-flags in the ptes. The dirty-tracking needs
> > > to be reset
> > > by clearing the dirty bits and flush tlb. This has a complexity of
> > > O(num_buffer_object_pages) and dirty bits need to be scanned in
> > > full before
> > > each gpu-access.
> > >
> > > So in practice the two methods need to be interleaved for best
> > > performance.
> > >
> > > So to facilitate this, I propose two new helpers,
> > > apply_as_wrprotect() and
> > > apply_as_clean() ("as" stands for address-space) both inspired by
> > > unmap_mapping_range(). Users of these helpers are in the making,
> > > but needs
> > > some cleaning-up.
> >
> > To be clear this should _only be use_ for mmap of device file ? If so
> > the API should try to enforce that as much as possible for instance
> > by
> > mandating the file as argument so that the function can check it is
> > only use in that case. Also big scary comment to make sure no one
> > just
> > start using those outside this very limited frame.
>
> Fine with me. Perhaps we could BUG() / WARN() on certain VMA flags
> instead of mandating the file as argument. That can make sure we
> don't accidently hit pages we shouldn't hit.

You already provide the mapping as an argument, so it should not be
hard to check that it is a mapping of a device file; the vma flags
alone will not be enough to identify this case.

> > >
> > > There's also a change to x_mkwrite() to allow dropping the mmap_sem
> > > while
> > > waiting.
> >
> > This will most likely conflict with userfaultfd write protection.
>
> Are you referring to the x_mkwrite() usage itself or the mmap_sem
> dropping facilitation?

Both, I believe. However, I have not tried to apply your patches on
top of the userfaultfd patchset.

Cheers,
Jérôme