On Thu, 2015-10-08 at 16:57 +0100, Chris Wilson wrote:
> Correct me if I am wrong, but it looks like i915_svm implements the
> lowlevel interface with the hardware, so by convention is intel_svm.c

I think the plan is to drop the driver-mode interface altogether and use
the OS-mode version that I posted a few hours ago. So some of your review
may not apply; other parts might.

> > +	struct task_struct *tsk;
>
> One task? A context can be passed by the device fd to another process.
> Do we inherit the VM along with the context? I don't see anything to
> prevent such.

You still get to deal with this, and yes -- the assigned PASID value
(given by the intel_svm_bind_mm() call) will continue to refer to the VM
of the process which called intel_svm_bind_mm(). Even when it's used from
elsewhere.

Note that you *could* bind the PASID at the time you actually submit the
request, and in that case you could bind it to the VM of the process
which is submitting the request, rather than the process which did the
initial setup.

> > +static void gpu_mm_segv(struct task_struct *tsk, unsigned long address,
> > +			int si_code)
> > +{
> > +	siginfo_t info;
> > +
> > +	/* Need specific signal info here */
> > +	info.si_signo = SIGSEGV;
> > +	info.si_errno = EIO;
> > +	info.si_code = si_code;
> > +	info.si_addr = (void __user *)address;
> > +
> > +	force_sig_info(SIGSEGV, &info, tsk);
>
> force_sig_info() is not exported, ah you builtin i915-svm.c

I'm leaving this with you too, as discussed in a separate thread. The
OS-mode IOMMU support is just returning an appropriate 'failed' response
code to the page request, and letting the endpoint device handle that as
it sees fit.

> > +	spin_lock(&dev_priv->svm.lock);
> > +	ctx = dev_priv->svm.pasid_ctx[desc.pasid];
> > +	tsk = ctx->tsk;
> > +	mm = tsk->mm;
> > +	address = desc.addr << PAGE_SHIFT;
> > +	ringbuf = ctx->engine[RCS].ringbuf;
> > +	spin_unlock(&dev_priv->svm.lock);
>
> All of the above can disappear at anytime after the unlock?

Answering for my own code... I don't touch any of the gfx context stuff,
obviously, and I don't even use the task. I do have a reference on the mm
and it isn't going away.

> > +
> > +	down_read_trylock(&mm->mmap_sem);
> > +	vma = find_extend_vma(mm, address);
> > +	if (!vma || address < vma->vm_start) {
> > +		DRM_ERROR("bad VMA or address out of range\n");

Um... Jesse, I used that same 'address < vma->vm_start' check in my own
version. But is that going to prevent us from allowing the GPU to grow
the stack VMA downwards? I note the AMD one does it too.
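For reference, the shape of the OS-mode fault-service path I keep
referring to is roughly as below. This is only a sketch of the ordering
(mm reference, mmap_sem, find_extend_vma(), handle_mm_fault(), then a
response to the page request); svm_service_page_request() and the
prq_resp values are placeholder names rather than anything in the posted
patches:

#include <linux/mm.h>
#include <linux/sched.h>

/* Placeholder response codes; the real values for page request
 * responses come from the VT-d spec, not from this sketch. */
enum prq_resp { PRQ_RESP_SUCCESS, PRQ_RESP_INVALID };

static enum prq_resp svm_service_page_request(struct mm_struct *mm,
					      unsigned long address,
					      bool write)
{
	struct vm_area_struct *vma;
	enum prq_resp resp = PRQ_RESP_INVALID;

	/* Don't touch an mm that is already being torn down; otherwise
	 * take a reference so it can't go away while we fault. */
	if (!atomic_inc_not_zero(&mm->mm_users))
		return resp;

	down_read(&mm->mmap_sem);

	/* find_extend_vma() will try to grow a VM_GROWSDOWN stack before
	 * returning, which is where the question about the
	 * 'address < vma->vm_start' check above comes in. */
	vma = find_extend_vma(mm, address);
	if (!vma || address < vma->vm_start)
		goto out;

	if (handle_mm_fault(mm, vma, address,
			    write ? FAULT_FLAG_WRITE : 0) & VM_FAULT_ERROR)
		goto out;

	resp = PRQ_RESP_SUCCESS;
out:
	up_read(&mm->mmap_sem);
	mmput(mm);

	/* No SIGSEGV from here: on failure we just give the IOMMU an
	 * 'invalid' response for the page request and let the endpoint
	 * device handle it. */
	return resp;
}

The page-request handler would look up the mm for the reported PASID,
call something like the above, and then queue the response descriptor
back to the hardware.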
> > +/* Make sure GPU writes can't hit the mm that's about to go away */
> > +static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> > +{
> > +	struct intel_mm_struct *ims = container_of(mn, struct intel_mm_struct,
> > +						   notifier);
> > +	struct drm_i915_private *dev_priv = ims->dev_priv;
> > +	struct drm_device *dev = dev_priv->dev;
> > +	struct intel_context *ctx;
> > +
> > +	/*
> > +	 * Wait for any outstanding activity and unbind the mm.  Since
> > +	 * each context has its own ring, we can simply wait for the ring
> > +	 * to idle before invalidating the PASID and flushing the TLB.
> > +	 */
> > +	mutex_lock(&dev->struct_mutex);
> > +	list_for_each_entry(ctx, &ims->context_list, mm_list) {
> > +		intel_ring_idle(ctx->engine[RCS].ringbuf->ring);
> > +	}
> > +
> > +	intel_iommu_tlb_flush(dev_priv->dev);
> > +	mutex_unlock(&dev->struct_mutex);
>
> Erm, what! So you halt the GPU everytime? But you've already
> invalidated the shadow PTE -- ah, invalidate-range looks to be a wip.

That's the callback for when the MM goes away -- on process exit, or
when we unbind the PASID. But it's still suboptimal, and I am strongly
resisting the temptation to have a callback from the OS-mode code into
the device driver in this code path.

For the normal case of unbinding the PASID cleanly, device drivers are
expected to flush their queue of requests for a given PASID *before*
calling intel_svm_unbind_pasid(). That includes contexts which might be
stalled, waiting for a page request. So you're expected to do any ring
management in advance.

If a process exits uncleanly, exit_mm() happens before exit_files(). So
the MM has gone before the i915 driver gets to clean up. In that case, we
end up in the intel_mm_release() function and it just clears the page
tables and flushes the IOTLB. Accesses will take faults, and the proper
cleanup will happen shortly.

> > +static void intel_change_pte(struct mmu_notifier *mn, struct mm_struct *mm,
> > +			     unsigned long address, pte_t pte)
> > +{
> > +	struct intel_mm_struct *ims = container_of(mn, struct intel_mm_struct,
> > +						   notifier);
> > +	struct drm_i915_private *dev_priv = ims->dev_priv;
> > +	struct drm_device *dev = dev_priv->dev;
> > +
> > +	struct intel_context *ctx;
> > +
> > +	mutex_lock(&dev->struct_mutex);
> > +	list_for_each_entry(ctx, &ims->context_list, mm_list)
> > +		intel_flush_page_locked(dev, ctx->pasid, address);
> > +	mutex_unlock(&dev->struct_mutex);
>
> Suggests you really want a ims->spinlock for context_list instead.

We now use a single PASID for all contexts, so we don't need any list
traversal like this anyway.

> > +struct intel_mm_struct *intel_bind_mm(struct drm_device *dev,
> > +				      struct intel_context *ctx)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_mm_struct *ims;
> > +	struct mmu_notifier *mn;
> > +	int ret;
> > +
> > +	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
> > +
> > +	mn = mmu_find_ops(current->mm, &intel_mmuops);
>
> Magic function, I am missing its definition

A previous patch in the series exports this. It basically just checks if
there is *already* a mmu_notifier registered with the same
mmu_notifier_ops structure, and returns it if so.

Because of the hairy locking rules around the mmu_notifier, I've done
this differently in my own code -- using idr_for_each_entry() on the IDR
I use to allocate PASIDs. That's fine while there is a relatively low
number of PASIDs allocated (i.e. a low number of *processes* using
PASIDs, since there's one PASID per process). But if we end up with
lots, that idr_for_each_entry() might become burdensome and we might
need to take another look.

> > +	if (mn) {
> > +		ims = container_of(mn, struct intel_mm_struct, notifier);
> > +		kref_get(&ims->kref);
> > +		goto out;
> > +	}
> > +
> > +	ims = kzalloc(sizeof(*ims), GFP_KERNEL);
> > +	if (!ims) {
> > +		ret = -ENOMEM;
> > +		goto error;
> > +	}
> > +	INIT_LIST_HEAD(&ims->context_list);
> > +
> > +	ims->notifier.ops = &intel_mmuops;
> > +
> > +	ret = mmu_notifier_register(&ims->notifier, current->mm);
>
> This has lock inversion between struct_mutex and mm->mmap_sem.

I think this is gone, because I no longer take my equivalent pasid_mutex
within the mmu_notifier callbacks. Unless I'm missing something.
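To make that concrete, the idr_for_each_entry() lookup I describe above
is along the lines of the sketch below. It's illustrative only: struct
intel_svm, pasid_idr and pasid_mutex are stand-in names rather than the
posted code, and the real per-PASID state carries more than this. The
relevant point for the locking question is that pasid_mutex is only
taken on the bind/unbind path, never from inside the mmu_notifier
callbacks:

#include <linux/idr.h>
#include <linux/kref.h>
#include <linux/mmu_notifier.h>
#include <linux/mutex.h>

/* Stand-in for the per-PASID state; names are illustrative. */
struct intel_svm {
	struct mmu_notifier notifier;
	struct mm_struct *mm;
	struct kref kref;
	int pasid;
};

static DEFINE_MUTEX(pasid_mutex);	/* serialises bind/unbind only */
static DEFINE_IDR(pasid_idr);		/* PASID -> struct intel_svm */

/* Walk the allocated PASIDs looking for one already bound to this mm.
 * This is O(number of PASIDs), which is fine while that number stays
 * small (one PASID per process). */
static struct intel_svm *intel_svm_find_mm(struct mm_struct *mm)
{
	struct intel_svm *svm;
	int pasid;

	mutex_lock(&pasid_mutex);
	idr_for_each_entry(&pasid_idr, svm, pasid) {
		if (svm->mm == mm) {
			kref_get(&svm->kref);
			mutex_unlock(&pasid_mutex);
			return svm;
		}
	}
	mutex_unlock(&pasid_mutex);
	return NULL;
}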
-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@xxxxxxxxx                              Intel Corporation