On Sat, May 10, 2014 at 5:58 AM, Ben Widawsky <benjamin.widawsky@xxxxxxxxx> wrote:
> Just as before, these patches are living based off of my Broadwell
> branch, here:
> http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=gpu_mirror
>
> These are the follow-on patches for [1]
>
> This patch series brings 3 things:
> 1. Dynamic page table allocation for gen6-8
> 2. 64b (48b canonical) graphics virtual address space for Broadwell
> 3. An interface to specify a specific offset for a BO.
>
> It's taken way longer than I thought to get this work done, and given
> the current state of our driver, I fear I may not have time to see this
> through to the end before I am pulled onto other things. If people want
> to send me smallish bugfixes, I will gladly do my best to fix them
> quickly. If there are more substantial change requests wrt design or
> patch reorganization, I will not be able to accommodate. Someone else
> must take over this patch series at that point if they want these
> features. I do believe that everything up until the userptr patch is in
> decent shape though, so we'll see, I guess. (if you are qualified to
> take this over, and have interest, please let me know).
>
> The patch series is highly volatile and not manicured. I've run exactly
> 1 test on the GPU mirror (see below for what that means), though many
> more on the prior stuff. The series depends on full PPGTT, which is not
> yet enabled by default, and has a few outstanding issues. It also has
> been developed exclusively on pre-production hardware. I am only sending
> this out now because I will be on vacation for the next 10 days, and I
> know there are people that can benefit from this code before I return.
> With that, I got the last parts of this working very recently, and
> they're very hackish.
> One reason for this lack of refinement is that I expect the
> interfaces for letting userspace dictate things to change (more on this
> later), and the other is that I just ran out of time before my vacation.
> Throughout development, I've been hitting issues which I am not yet sure
> if they are bugs in my code, bugs in full PPGTT, bugs in userptr, or
> general flakiness. There are a few patches in here which say TESTME
> reflecting this. Also, if you want to run this, I highly recommend
> turning off semaphores, and rc6. (To be honest, I've not tried it
> recently). You also need to turn on PPGTT since it is disabled by
> default.
>
> modprobe i915 enable_ppgtt=2 semaphores=0 enable_rc6=0
>
> What you get in this series is what I'm going to coin, GPU mirror. This
> patch series allows one to allocate an arbitrary address for your GPU
> buffer object, and map it to a specific space within the GPU's address
> space. This is only possible because on Broadwell we get a 64b canonical
> GPU address space, and this allows us to map any CPU address as a GPU
> address. The obvious usage here is malloc(). malloc() returns a pointer
> that is valid on the CPU. Now that address can be identical on the GPU.
>
> The interface provided is identical to the userptr interface previously
> posted by Chris Wilson. I've added a flag to that interface that
> indicates this new functionality. This is not necessarily the final
> version, and it's arguably not the best idea either. The reason for this
> choice is we had users of userptr that wanted to try out this concept
> and not have to do much porting.
>
> To get to the userptr interface, I had to make a few things happen
> first. I needed to get dynamic page table allocation and teardown
> working. This was posted previously for gen6-7 [1] (with very rough code
> for gen8). I've now added more robust support for gen8 dynamic page
> table allocations.
> Doing the allocations dynamically was important
> because preallocating all 4 levels of page tables is not feasible in a
> real system. 4 level page tables are required in order to be able to
> support the 64b canonical address space.
>
> With that all done, I was able to make a few minor hacks to userptr,
> take the intel-gpu-tools test from Tvrtko, and see at least one pass.
> FWIW, I am currently running,
> ./tests/gem_userptr_blits --run-subtest coherency-unsync
>
> Since I feel the interface will likely change, I do not feel compelled
> to post either my libdrm or my IGT changes. If you want the modified
> test, let me know, as I don't think it's really relevant here.
>
> One last thing. Intel GPU tools, as it stands today, makes a lot of
> assumptions about using an address space > 32b. I have not had time to
> fix this. It is something which needs fixing before this series could
> even be considered testable.

Until full ppgtt is fixed up and enabled by default, it doesn't make
sense in my opinion to pile more ppgtt features on top. Until that's
addressed, or someone convinces me that my opinion is stupid, I'll
reject this.

Of course that doesn't cover the entire patch series, since the dynamic
ppgtt pte stuff is part of fixing up ppgtt. But I still think we should
address the bugs first before we make the code even more complicated, so
I'd prefer to merge even the dynamic pte stuff only after full ppgtt is
enabled by default again.
-Daniel

>
> [1] http://lists.freedesktop.org/archives/intel-gfx/2014-March/041814.html
>
> Ben Widawsky (54):
>   drm/i915: Fix flush before context switch comment
>   Revert "drm/i915: Drop I915_PARAM_HAS_FULL_PPGTT again"
>   drm/i915: Wrap VMA binding
>   drm/i915: Make pin global flags explicit
>   drm/i915: Split out aliasing binds
>   drm/i915: fix gtt_total_entries()
>   drm/i915: Rename to GEN8_LEGACY_PDPES
>   drm/i915: Split out verbose PPGTT dumping
>   drm/i915: s/pd/pdpe, s/pt/pde
>   drm/i915: rename map/unmap to dma_map/unmap
>   drm/i915: Setup less PPGTT on failed pagedir
>   drm/i915: clean up PPGTT init error path
>   drm/i915: Un-hardcode number of page directories
>   drm/i915: Make gen6_write_pdes gen6_map_page_tables
>   drm/i915: Range clearing is PPGTT agnostic
>   drm/i915: Page table helpers, and define renames
>   drm/i915: construct page table abstractions
>   drm/i915: Complete page table structures
>   drm/i915: Create page table allocators
>   drm/i915: Generalize GEN6 mapping
>   drm/i915: Clean up pagetable DMA map & unmap
>   drm/i915: Always dma map page table allocations
>   drm/i915: Consolidate dma mappings
>   drm/i915: Always dma map page directory allocations
>   drm/i915: Track GEN6 page table usage
>   drm/i915: Extract context switch skip logic
>   drm/i915: Force pd restore when PDEs change, gen6-7
>   drm/i915: Finish gen6/7 dynamic page table allocation
>   drm/i915/bdw: Use dynamic allocation idioms on free
>   drm/i915/bdw: pagedirs rework allocation
>   drm/i915/bdw: pagetable allocation rework
>   drm/i915/bdw: Make the pdp switch a bit less hacky
>   drm/i915: num_pd_pages/num_pd_entries isn't useful
>   drm/i915: Extract PPGTT param from pagedir alloc
>   drm/i915/bdw: Split out mappings
>   drm/i915/bdw: begin bitmap tracking
>   drm/i915/bdw: Dynamic page table allocations
>   drm/i915/bdw: Scratch unused pages
>   drm/i915/bdw: Add ppgtt info for dynamic pages
>   drm/i915/bdw: Optimize PDP loads
>   TESTME: Either drop the last patch or fix it.
>   drm/i915/bdw: Add dynamic page trace events
>   drm/i915/bdw: Make pdp allocation more dynamic
>   drm/i915/bdw: Abstract PDP usage
>   drm/i915/bdw: implement alloc/teardown for 4lvl
>   drm/i915/bdw: 4 level pages tables
>   drm/i915: Restructure map vs. insert entries
>   drm/i915/bdw: make aliasing PPGTT dynamic
>   drm/i915: Expand error state's address width to 64b
>   drm/i915/bdw: Flip the 48b switch
>   TESTME: GFX_TLB_INVALIDATE_EXPLICIT
>   TESTME: Always force invalidate
>   drm/i915: Track userptr VMAs
>   drm/i915/userptr: Mirror GPU addr at ioctl (HACK/POC)
>
> Chris Wilson (2):
>   drm/i915: Prevent signals from interrupting close()
>   drm/i915: Introduce mapping of user pages into video memory (userptr)
>     ioctl
>
>  drivers/gpu/drm/i915/Kconfig               |    1 +
>  drivers/gpu/drm/i915/Makefile              |    1 +
>  drivers/gpu/drm/i915/i915_debugfs.c        |  112 +-
>  drivers/gpu/drm/i915/i915_dma.c            |   15 +-
>  drivers/gpu/drm/i915/i915_drv.h            |   40 +-
>  drivers/gpu/drm/i915/i915_gem.c            |   61 +-
>  drivers/gpu/drm/i915/i915_gem_context.c    |   31 +-
>  drivers/gpu/drm/i915/i915_gem_dmabuf.c     |    5 +
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   22 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 1810 +++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |  354 +++++-
>  drivers/gpu/drm/i915/i915_gem_userptr.c    |  767 ++++++++++++
>  drivers/gpu/drm/i915/i915_gpu_error.c      |   21 +-
>  drivers/gpu/drm/i915/i915_reg.h            |    1 +
>  drivers/gpu/drm/i915/i915_trace.h          |  140 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |    2 +-
>  include/uapi/drm/i915_drm.h                |   20 +
>  17 files changed, 2823 insertions(+), 580 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_gem_userptr.c
>
> --
> 1.9.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
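
For reference, the quoted claim that preallocating all 4 levels of page
tables is not feasible, and the "48b canonical" wording, can be made
concrete with a little arithmetic. This is only a rough sketch under
assumptions the mail does not state: 4 KiB pages, 8-byte entries and 9
index bits per level (an x86-64-style layout). All names below are
illustrative and do not come from the i915 driver.

```python
# Assumptions (NOT stated in the mail): 4 KiB pages, 8-byte entries,
# 9 index bits per level, i.e. an x86-64-style 4-level layout.
VA_BITS = 48      # canonical address width
PAGE_SHIFT = 12   # 4 KiB pages (assumed)
LEVEL_BITS = 9    # 512 entries per table (assumed)
ENTRY_BYTES = 8   # size of one table entry (assumed)


def is_canonical48(addr):
    """True if bits 63:47 of a 64-bit address all equal bit 47."""
    top = (addr & 0xFFFF_FFFF_FFFF_FFFF) >> (VA_BITS - 1)
    return top == 0 or top == (1 << (64 - VA_BITS + 1)) - 1


def canonicalize48(addr):
    """Sign-extend bit 47 into bits 63:48."""
    low = addr & ((1 << VA_BITS) - 1)
    if low & (1 << (VA_BITS - 1)):
        low |= ((1 << 64) - 1) ^ ((1 << VA_BITS) - 1)
    return low


def full_prealloc_bytes():
    """Bytes needed to preallocate every table at every level."""
    total = 0
    # One pass per level: the leaf tables first, then each parent level.
    for shift in range(PAGE_SHIFT, VA_BITS, LEVEL_BITS):
        total += (1 << (VA_BITS - shift)) * ENTRY_BYTES
    return total


def single_page_bytes():
    """Bytes of tables needed to map one page when allocating on demand."""
    levels = (VA_BITS - PAGE_SHIFT) // LEVEL_BITS
    return levels * (1 << LEVEL_BITS) * ENTRY_BYTES


if __name__ == "__main__":
    print("full preallocation: %.2f GiB" % (full_prealloc_bytes() / 2**30))
    print("one page, on demand: %d KiB" % (single_page_bytes() // 1024))
    print("canonical:", is_canonical48(0xFFFF800000000000))
```

Under those assumptions the leaf level alone accounts for 2^36 entries,
which is why allocating tables only for the ranges actually in use is
the practical option.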