>-----Original Message----- >From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >Sent: Friday, November 22, 2019 3:50 PM >To: Ruhl, Michael J <michael.j.ruhl@xxxxxxxxx>; intel- >gfx@xxxxxxxxxxxxxxxxxxxxx >Subject: RE: [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU >i915_request > >Quoting Ruhl, Michael J (2019-11-22 20:37:59) >> >-----Original Message----- >> >From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of >Chris >> >Wilson >> >Sent: Friday, November 22, 2019 2:03 AM >> >To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx >> >Subject: [CI] drm/i915: Use a ctor for TYPESAFE_BY_RCU >> >i915_request >> > >> >As we start peeking into requests for longer and longer, e.g. >> >incorporating use of spinlocks when only protected by an >> >rcu_read_lock(), we need to be careful in how we reset the request when >> >recycling and need to preserve any barriers that may still be in use as >> >the request is reset for reuse. >> > >> >Quoting Linus Torvalds: >> > >> >> If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU? >> > >> > .. because the object can be accessed (by RCU) after the refcount has >> > gone down to zero, and the thing has been released. >> > >> > That's the whole and only point of SLAB_TYPESAFE_BY_RCU. >> > >> > That flag basically says: >> > >> > "I may end up accessing this object *after* it has been free'd, >> > because there may be RCU lookups in flight" >> > >> > This has nothing to do with constructors. It's ok if the object gets >> > reused as an object of the same type and does *not* get >> > re-initialized, because we're perfectly fine seeing old stale data. >> > >> > What it guarantees is that the slab isn't shared with any other kind >> > of object, _and_ that the underlying pages are free'd after an RCU >> > quiescent period (so the pages aren't shared with another kind of >> > object either during an RCU walk). >> > >> > And it doesn't necessarily have to have a constructor, because the >> > thing that a RCU walk will care about is >> > >> > (a) guaranteed to be an object that *has* been on some RCU list (so >> > it's not a "new" object) >> > >> > (b) the RCU walk needs to have logic to verify that it's still the >> > *same* object and hasn't been re-used as something else. >> > >> > In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used >> > immediately, but because it gets reused as the same kind of object, >> > the RCU walker can "know" what parts have meaning for re-use, in a way >> > it couidn't if the re-use was random. >> > >> > That said, it *is* subtle, and people should be careful. >> > >> >> So the re-use might initialize the fields lazily, not necessarily using a ctor. >> > >> > If you have a well-defined refcount, and use "atomic_inc_not_zero()" >> > to guard the speculative RCU access section, and use >> > "atomic_dec_and_test()" in the freeing section, then you should be >> > safe wrt new allocations. >> > >> > If you have a completely new allocation that has "random stale >> > content", you know that it cannot be on the RCU list, so there is no >> > speculative access that can ever see that random content. >> > >> > So the only case you need to worry about is a re-use allocation, and >> > you know that the refcount will start out as zero even if you don't >> > have a constructor. >> > >> > So you can think of the refcount itself as always having a zero >> > constructor, *BUT* you need to be careful with ordering. >> > >> > In particular, whoever does the allocation needs to then set the >> > refcount to a non-zero value *after* it has initialized all the other >> > fields. And in particular, it needs to make sure that it uses the >> > proper memory ordering to do so. >> > >> > NOTE! One thing to be very worried about is that re-initializing >> > whatever RCU lists means that now the RCU walker may be walking on >the >> > wrong list so the walker may do the right thing for this particular >> > entry, but it may miss walking *other* entries. So then you can get >> > spurious lookup failures, because the RCU walker never walked all the >> > way to the end of the right list. That ends up being a much more >> > subtle bug. >> > >> >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >> >Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> >> >--- >> > drivers/gpu/drm/i915/i915_request.c | 49 ++++++++++++++++++--------- >> > drivers/gpu/drm/i915/i915_scheduler.c | 6 ++++ >> > drivers/gpu/drm/i915/i915_scheduler.h | 1 + >> > drivers/gpu/drm/i915/i915_sw_fence.c | 8 +++++ >> > drivers/gpu/drm/i915/i915_sw_fence.h | 2 ++ >> > 5 files changed, 50 insertions(+), 16 deletions(-) >> > >> >diff --git a/drivers/gpu/drm/i915/i915_request.c >> >b/drivers/gpu/drm/i915/i915_request.c >> >index 00011f9533b6..a558f64186fa 100644 >> >--- a/drivers/gpu/drm/i915/i915_request.c >> >+++ b/drivers/gpu/drm/i915/i915_request.c >> >@@ -188,7 +188,7 @@ static void free_capture_list(struct i915_request >> >*request) >> > { >> > struct i915_capture_list *capture; >> > >> >- capture = request->capture_list; >> >+ capture = fetch_and_zero(&request->capture_list); >> > while (capture) { >> > struct i915_capture_list *next = capture->next; >> > >> >@@ -214,7 +214,7 @@ static void remove_from_engine(struct >i915_request >> >*rq) >> > spin_lock(&engine->active.lock); >> > locked = engine; >> > } >> >- list_del(&rq->sched.link); >> >+ list_del_init(&rq->sched.link); >> > spin_unlock_irq(&locked->active.lock); >> > } >> > >> >@@ -586,6 +586,21 @@ request_alloc_slow(struct intel_timeline *tl, gfp_t >> >gfp) >> > return kmem_cache_alloc(global.slab_requests, gfp); >> > } >> > >> >+static void __i915_request_ctor(void *arg) >> >+{ >> >+ struct i915_request *rq = arg; >> >+ >> >+ spin_lock_init(&rq->lock); >> >+ i915_sched_node_init(&rq->sched); >> >+ i915_sw_fence_init(&rq->submit, submit_notify); >> >+ i915_sw_fence_init(&rq->semaphore, semaphore_notify); >> >+ >> >+ rq->file_priv = NULL; >> >+ rq->capture_list = NULL; >> >+ >> >+ INIT_LIST_HEAD(&rq->execute_cb); >> >+} >> >> My understanding of the cache ctor is that it will only get invoked when >> the cache is expanded. It does not get called on each kmem_cache_alloc(). >> >> I.e. if you do an _alloc() and there is nothing available in the cache, a new >> slab is created, and the ctor is called. > >Correct. > >> Is that the subtly that you are referring to? >> >> Is doing this ctor only when new cache slabs are created sufficient? > >That is precisely the point. As we only protect some accesses with RCU, >the request we are peeking at may be reused while we access it. The >challenge is to keep all such access valid. E.g. consider that the whole >of i915_gem_busy_ioctl() is performed only protected by the >rcu_read_lock(), and yet we need the results to be accurate even though >the requests be reallocated in the middle of it. > >The situation that spurred the ctor was taking a spinlock in an rcu path >and finding it was re-initialised in the middle of that. To requote: > > This has nothing to do with constructors. It's ok if the object gets > > reused as an object of the same type and does *not* get > > re-initialized, because we're perfectly fine seeing old stale data. Ok. I think I am following this. You are using the side effect of the ctor to protect against old data from being re-initialized until after RCU is done with it. Thanks for the clarification. Mike >-Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx