On Mon, Apr 08, 2013 at 07:18:11PM +0200, Daniel Vetter wrote: > On Tue, Feb 12, 2013 at 02:17:22PM +0000, Chris Wilson wrote: > > By exporting the ability to map user address and inserting PTEs > > representing their backing pages into the GTT, we can exploit UMA in order > > to utilize normal application data as a texture source or even as a > > render target (depending upon the capabilities of the chipset). This has > > a number of uses, with zero-copy downloads to the GPU and efficient > > readback making the intermixed streaming of CPU and GPU operations > > fairly efficient. This ability has many widespread implications from > > faster rendering of client-side software rasterisers (chromium), > > mitigation of stalls due to read back (firefox) and to faster pipelining > > of texture data (such as pixel buffer objects in GL or data blobs in CL). > > > > v2: Compile with CONFIG_MMU_NOTIFIER > > v3: We can sleep while performing invalidate-range, which we can utilise > > to drop our page references prior to the kernel manipulating the vma > > (for either discard or cloning) and so protect normal users. > > v4: Only run the invalidate notifier if the range intercepts the bo. > > v5: Prevent userspace from attempting to GTT mmap non-page aligned buffers > > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> > > Quick bikeshed: > - Still not really in favour of the in-page gtt offset handling ... I > still think that this is just a fancy mmap api, and it better reject > attempts to not map anything aligned to a full page outright. Strongly disagree. > - I915_USERPTR_UNSYNCHRONIZED ... eeek. That means that despite everyone > having mmu notifiers enabled in their distro config, you make sure sna > doesn't hit it. Imo not enough testing coverage ;-) Or this there > another reason behind this than "mmu notifiers are too slow"? > > Generally I'm a bit sloppy with going root-only for legacy X stuff (like > scanline waits), but this here looks very much generally useful. So not > exemption-material imo. Strongly disagree. Most of my machines do not have mmu-notifiers and would still like to benefit from userptr and I see no reason why we need to force mmu-notifiers. > - On a quick read I've seen some gtt mmap support remnants. This scares > me, a roundabout njet! would appease. Though I think that should already > happen with the checks we have to reject snoopable buffers? That's because there are platforms where it is theorectically possible and whilst I have no use case for it, I wanted to make it work nevertheless. I still think it is possible, but I could not see a way to do so without completely replacing the drm vm code. > - I think we should reject set_cacheing requests on usersptr objects. I don't think that is strictly required, just like we should not limit the user from using set_tiling. (Though the user is never actually going to tell the kernel about such tiling...) > - union drm_i915_gem_objects freaked me out shortly, until I've noticed > that it's only used for our private slab. Maybe and explicit max in > there? Also, this somewhat defeats that you've moved the userptr stuff > out of the base class - this way we won't save any memory ... The alternative is to use a union inside the object. Long ago, I had a few more objects in there. > - Basic igt to check the above api corner-cases return -EINVAL would be > nice. Been sitting around for ages, just waiting for the interface to be agreed upon. > - I need to check for deadlocks around the mmu notifier handling. Iirc > that thing takes all mm locks, and our own bo unref code can be called > from all kinds of interesting places. Since each vma object also holds a > ref onto a gem bo I suspect that we do have some fun all here ... The notifier itself takes no locks, so the locking is whatever state the caller sets up. The current locking order is (mm, struct mutex) i.e. the same ordering as used the notifier callbacks. -Chris -- Chris Wilson, Intel Open Source Technology Centre