Re: [PATCH v3 3/3] drm/i915: Use a task to cancel the userptr on invalidate_range

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Wed, 9 Sep 2015 16:08:00 +0100



On Wed, Sep 09, 2015 at 03:45:40PM +0100, Tvrtko Ursulin wrote:
> On 08/10/2015 09:51 AM, Chris Wilson wrote:
> >Whilst discussing possible ways to trigger an invalidate_range on a
> >userptr with an aliased GGTT mmapping (and so cause a struct_mutex
> >deadlock), the conclusion is that we can, and we must, prevent any
> >possible deadlock by avoiding taking the mutex at all during
> >invalidate_range. This has numerous advantages all of which stem from
> >avoid the sleeping function from inside the unknown context. In
> >particular, it simplifies the invalidate_range because we no longer
> >have to juggle the spinlock/mutex and can just hold the spinlock
> >for the entire walk. To compensate, we have to make get_pages a bit more
> >complicated in order to serialise with a pending cancel_userptr worker.
> >As we hold the struct_mutex, we have no choice but to return EAGAIN and
> >hope that the worker is then flushed before we retry after reacquiring
> >the struct_mutex.
> >
> >The important caveat is that the invalidate_range itself is no longer
> >synchronous. There exists a small but definite period in time in which
> >the old PTE's page remain accessible via the GPU. Note however that the
> >physical pages themselves are not invalidated by the mmu_notifier, just
> >the CPU view of the address space. The impact should be limited to a
> >delay in pages being flushed, rather than a possibility of writing to
> >the wrong pages. The only race condition that this worsens is remapping
> >an userptr active on the GPU where fresh work may still reference the
> >old pages due to struct_mutex contention. Given that userspace is racing
> >with the GPU, it is fair to say that the results are undefined.
> >
> >v2: Only queue (and importantly only take one refcnt) the worker once.
> 
> This one I looked at at the time of previous posting and it looked
> fine, minus one wrong line of thinking of mine. On a brief look it
> still looks good, so:
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> 
> I assume Michał has run all these through the relevant test cases?
> 
> Slightly related, I now worry about the WARN_ONs in
> __cancel_userptr__worker since they look to be triggerable by
> malicious userspace which is not good.

They could always be I thought, if you could somehow pin the userptr
into a hardware register and then unmap the vma. That is a scary thought
and one I would like a WARN for. That should be the only way, and I shudder
at the prospect of working out who to send the SIGBUS to.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx