>-----Original Message-----
>From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>Sent: Tuesday, June 30, 2020 3:29 PM
>To: Ruhl, Michael J <michael.j.ruhl@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
>Subject: Re: [PATCH v2] drm/i915/gem: Move obj->lut_list under its own lock
>
>Quoting Ruhl, Michael J (2020-06-30 20:11:16)
>> >-----Original Message-----
>> >From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Chris Wilson
>> >Sent: Monday, June 29, 2020 7:36 AM
>> >To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
>> >Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> >Subject: [PATCH v2] drm/i915/gem: Move obj->lut_list under its own lock
>> >
>> >The obj->lut_list is traversed when the object is closed as the file
>> >table is destroyed during process termination. As this occurs before we
>> >kill any outstanding context, if, due to some bug or another, the closure
>> >is blocked, then we fail to shoot down any in-flight operations,
>> >potentially leaving the GPU spinning forever. As we only need to guard
>> >the list against concurrent closures and insertions, the hold is short
>> >and merits being treated as a simple spinlock.
>>
>> The comment:
>>
>> /* Break long locks, and carefully continue on from this spot */
>>
>> seems contrary to your "the hold is short" comment. Is calling out
>> this apparent worst-case scenario (i.e. how it could happen) worth
>> documenting?
>
>It's paranoia, because list iteration can be slow, and historically slow
>object closure has been reported as an issue. Only a few objects will be
>shared between multiple contexts, and even then you would only expect a
>couple of contexts to share an object. One of the OglDrvCtx workloads would
>show up here, which convinced us to move to the per-object lists to shrink
>the number of elements walked.
>
>Even the close-race igts do not cause the list to become long enough to
>schedule, but it would be possible to create an object that was shared
>by 64k contexts. Just not wise in practice. [However, I should make sure
>an igt does hit the bookmark.]

I look forward to that igt. 😊

>> >@@ -104,21 +105,29 @@ void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
>> > {
>> > 	struct drm_i915_gem_object *obj = to_intel_bo(gem);
>> > 	struct drm_i915_file_private *fpriv = file->driver_priv;
>> >+	struct i915_lut_handle bookmark = {};
>> > 	struct i915_mmap_offset *mmo, *mn;
>> > 	struct i915_lut_handle *lut, *ln;
>> > 	LIST_HEAD(close);
>> >
>> >-	i915_gem_object_lock(obj);
>> >+	spin_lock(&obj->lut_lock);
>> > 	list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
>> > 		struct i915_gem_context *ctx = lut->ctx;
>> >
>> >-		if (ctx->file_priv != fpriv)
>> >-			continue;
>> >+		if (ctx && ctx->file_priv == fpriv) {
>>
>> Null checking ctx was not done before. I think this changed with the
>> possible cond_resched?
>
>The bookmark introduces lut->ctx == NULL to identify it as being
>special, hence the need to check.

Ok, a clean lock replacement with a paranoid worst-case fallback, just in case.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>

M

>-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx