Forgot the repo: http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt On Thu, Jun 27, 2013 at 04:30:01PM -0700, Ben Widawsky wrote: > First, I don't think this whole series is ready for merge yet. It is > however ready for a review, and I think a lot of the prep patches in the > series could be merged to make my rebasing life a bit easier. I cannot > continue ignoring pretty much all emails/bugs as I have for the last > month to wrap up this series. The current state is on my IVB, things are > pretty stable. I've seen one unexplained hang, but I'm hopeful review > might help me uncover/explain. > > This patch series introduces the next step in enabling full PPGTT, which > is per fd address space/context, and it also contains the previously > unmerged patches (some of which have been reworked, modified, or > rebased). In regards to the continued VMA changes, I think in these, the > delta with regard to the last posting is the bound list was per VM. It > is now global. I've also moved the active list to per VM. > > Brand new in the series we take the previous series' context per fd > (with address space) one step further and actually switch the address > spaces when we do context switches. In order to make this happen, the > series continues to chip away at removing the notion of an object only > ever being bound into one address space via the struct > i915_address_space and struct i915_vma data structures which are really > abstractions for a page directory, and current mapped ptes respectively. > The error state is improved since the last series (though still some > work there is probably needed). It also serves to remove the notion of > the aliasing PPGTT since in theory everything bound into the GGTT > shouldn't benefit from an aliasing PPGTT (fact check). > > With every context having it's own address space, and every open DRM fd > having it's own context, it's trivial on execbuf to lookup a context and > do the pinning in the proper address space. More importantly, it's > implicit that a context exists, which made this impossible to do > earlier. > > *A note on patch ordering:* In order to work this series incrementally, the > final patch ordering is admittedly a little bit strange. I'm more than willing > to rework these as requested, but I'd really prefer not to do really heavy > reordering unless there is a major benefit, or of course to fix bugs. > > # What is not in this patch series in the order I think we should handle it > (and I acknowledge some of this stuff is non-trivial): > > ## Review + QA coverage > > ## Porting to HSW > > It shouldn't be too much extra work, if any, to add support. I haven't looked > into it yet. > > ## Better vm/ppgtt info in error state collection > > In particular, I want to dump all the PTEs at hang, and at the very least the > guilt PTEs. This isn't difficult, and can be done with copypasta from the > existing dumper I have. > > ## User space and the implications > > Now that contexts are valid on all rings, userspace should begin emitting the > context for all rings if it expects both rings to be able to access both > objects in the same offset. The mesa change looks to be just one line which > simplies emits the context for batch->is_blit, I doubt libva is using contexts, > and SNA seems not to. The plan to support mesa will be to do the detection in > libdrm, and go ahead with the simple mesa one liner. I've been using the > oneliner if mesa for a while now, but we will need to support old user space in > the kernel. So there might be a bit of work even on the kernel side here. We > also need some IGT tools test updates. I have messy versions of these locally > already. > > ## Performance data > > I think it doesn't preclude preliminary review of the patches since the main > goal of PPGTT is really abourt security, correctness, and enabling other > things. I will update with some numbers after I work on it a bit more. > > > ## Testing on SNB > > If our current code is correct, then I think these patches might work on SNB > as is, but it's untested. There is currently no way to disconnect contexts + > PPGTT from the whole thing; so if this doesn't work - we'll need rework some of > the code. I think it should just entail bringing back aliasing ppgtt, and not > doing the address space switch when switching contexts (aliasing ppgtt will > have a null switch_mm()). > > ## Soft pin interface > > I'd like to defer the discussion until these patches are merged. > > ## On demand page table allocation > > This is a potentially very useful optimization for at least the following > reasons: > * any app using contexts will have an extra set of page tables it isn't using; > wasted memory > * Reduce DCLV to reduce pd fetch latency > * Allows Better use of a switch to default context for low memory situations > (evicting unused page tables for example) > > ## Per VMA cache levels/control > > There are situations in the code where we have to flush the GPU pipeline in > order to change cache levels. This should no longer be the case for unaffected > VMs (I think). The same may be true with domain tracking. > > ## dmabuf/prime integration > > I haven't looked into what's missing to support it. If I'm lucky, it just works. > > > > With that, if you haven't already moved on, chanting tl;dr - all comments > welcome. > > --- > > Ben Widawsky (65): > drm/i915: Remove extra error state NULL > drm/i915: Extract error buffer capture > drm/i915: make PDE|PTE platform specific > drm/i915: Don't clear gtt with 0 entries > drm/i915: Conditionally use guard page based on PPGTT > drm/i915: Use drm_mm for PPGTT PDEs > drm/i915: cleanup context fini > drm/i915: Do a fuller init after reset > drm/i915: Split context enabling from init > drm/i915: destroy i915_gem_init_global_gtt > drm/i915: Embed PPGTT into the context > drm/i915: Unify PPGTT codepaths on gen6+ > drm/i915: Move ppgtt initialization down > drm/i915: Tie context to PPGTT > drm/i915: Really share scratch page > drm/i915: Combine scratch members into a struct > drm/i915: Drop dev from pte_encode > drm/i915: Use gtt shortform where possible > drm/i915: Move fbc members out of line > drm/i915: Move gtt and ppgtt under address space umbrella > drm/i915: Move gtt_mtrr to i915_gtt > drm/i915: Move stolen stuff to i915_gtt > drm/i915: Move aliasing_ppgtt > drm/i915: Put the mm in the parent address space > drm/i915: Move active/inactive lists to new mm > drm/i915: Create a global list of vms > drm/i915: Remove object's gtt_offset > drm: pre allocate node for create_block > drm/i915: Getter/setter for object attributes > drm/i915: Create VMAs (part 1) > drm/i915: Create VMAs (part 2) - kill gtt space > drm/i915: Create VMAs (part 3) - plumbing > drm/i915: Create VMAs (part 3.5) - map and fenceable tracking > drm/i915: Create VMAs (part 4) - Error capture > drm/i915: Create VMAs (part 5) - move mm_list > drm/i915: Create VMAs (part 6) - finish error plumbing > drm/i915: create an object_is_active() > drm/i915: Move active to vma > drm/i915: Track all VMAs per VM > drm/i915: Defer request freeing > drm/i915: Clean up VMAs before freeing > drm/i915: Replace has_bsd/blt with a mask > drm/i915: Catch missed context unref earlier > drm/i915: Add a context open function > drm/i915: Permit contexts on all rings > drm/i915: Fix context fini refcounts > drm/i915: Better reset handling for contexts > drm/i915: Create a per file_priv default context > drm/i915: Remove ring specificity from contexts > drm/i915: Track which ring a context ran on > drm/i915: dump error state based on capture > drm/i915: PPGTT should take a ppgtt argument > drm/i915: USE LRI for switching PP_DIR_BASE > drm/i915: Extract mm switching to function > drm/i915: Write PDEs at init instead of enable > drm/i915: Disallow pin with full ppgtt > drm/i915: Get context early in execbuf > drm/i915: Pass ctx directly to switch/hangstat > drm/i915: Actually add the new address spaces > drm/i915: Use multiple VMs > drm/i915: Kill now unused ppgtt_{un,}bind > drm/i915: Add PPGTT dumper > drm/i915: Dump all ppgtt > drm/i915: Add debugfs for vma info per vm > drm/i915: Getparam full ppgtt > > Chris Wilson (1): > drm: Optionally create mm blocks from top-to-bottom > > drivers/gpu/drm/drm_mm.c | 134 +++--- > drivers/gpu/drm/i915/i915_debugfs.c | 215 ++++++++-- > drivers/gpu/drm/i915/i915_dma.c | 25 +- > drivers/gpu/drm/i915/i915_drv.c | 57 ++- > drivers/gpu/drm/i915/i915_drv.h | 353 ++++++++++------ > drivers/gpu/drm/i915/i915_gem.c | 639 +++++++++++++++++++++-------- > drivers/gpu/drm/i915/i915_gem_context.c | 279 +++++++++---- > drivers/gpu/drm/i915/i915_gem_debug.c | 11 +- > drivers/gpu/drm/i915/i915_gem_evict.c | 64 +-- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 138 ++++--- > drivers/gpu/drm/i915/i915_gem_gtt.c | 541 ++++++++++++++---------- > drivers/gpu/drm/i915/i915_gem_stolen.c | 87 ++-- > drivers/gpu/drm/i915/i915_gem_tiling.c | 21 +- > drivers/gpu/drm/i915/i915_irq.c | 197 ++++++--- > drivers/gpu/drm/i915/i915_trace.h | 38 +- > drivers/gpu/drm/i915/intel_display.c | 40 +- > drivers/gpu/drm/i915/intel_drv.h | 7 - > drivers/gpu/drm/i915/intel_fb.c | 8 +- > drivers/gpu/drm/i915/intel_overlay.c | 26 +- > drivers/gpu/drm/i915/intel_pm.c | 65 +-- > drivers/gpu/drm/i915/intel_ringbuffer.c | 32 +- > drivers/gpu/drm/i915/intel_sprite.c | 8 +- > include/drm/drm_mm.h | 147 ++++--- > include/uapi/drm/i915_drm.h | 1 + > 24 files changed, 2044 insertions(+), 1089 deletions(-) > > -- > 1.8.3.1 > -- Ben Widawsky, Intel Open Source Technology Center