Re: [PATCH] drm/i915: Change semantics of context isolation reporting to UM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 04, 2022 at 02:24:07PM +0200, Daniel Vetter wrote:
> On Fri, 29 Apr 2022 at 17:11, Adrian Larumbe
> <adrian.larumbe@xxxxxxxxxxxxx> wrote:
> > I915_PARAM_HAS_CONTEXT_ISOLATION was already being used as a boolean by
> > both Iris and Vulkan , and stood for the guarantee that, when creating a
> > new context, all state set by it will not leak to any other context.
> >
> > However the actual return value was a bitmask where every bit stood for an
> > initialised engine, and IGT test gem_ctx_isolation makes use of this mask
> > for deciding on the actual context engine isolation status.
> >
> > However, we do not provide UAPI for IGT tests, so the value returned by the
> > PARAM ioctl has to reflect Mesa usage as a boolean.
> >
> > This change only made sense after compute engine support was added to the
> > driver in commit 944823c9463916dd53f3 ("drm/i915/xehp: Define compute class
> > and engine") because no context isolation can be assumed on any device with
> > both RCS annd CCS engines.
> >
> > Signed-off-by: Adrian Larumbe <adrian.larumbe@xxxxxxxxxxxxx>
> 
> Top level post and adding Matt Roper and dri-devel.
> 
> This was meant as a simple cleanup after CCS enabling in upstream, but
> that CCS enabling seems to have gone wrong.
> 
> What I thought we should be done for CCS enabling is the following:
> - actually have some igt-side hardcoded assumption about how much
> engines are isolated from each another, which is a hw property. I
> think some of that landed, but it's very incomplete
> 
> - convert all igt tests over to that. At least gem_ctx_isolation.c is
> not converted over, as Adrian pointed out.

I pointed that out last week in one of our offline syncs and that's what
got the ball rolling on that test again.  But you specifically told us
that the uapi cleanup for context isolation shouldn't block the CCS
patches from landing since that was still happening in parallel:

    "...I do see the uapi cleanup as part of this multi engine/CCS
    enabling, but it's not a blocker to land the patches..."

Did we misunderstand what you were trying to say in that email or was
there a change of direction here?


Matt

> 
> - once igt stopped using this context isolation getparam (we do not,
> ever, create uapi just for testcases), fix up the uapi to what iris
> actually needs, which is _only_ a boolean which indicates whether the
> kernel's context setup code leaks register state from existing
> contexts to newly created ones. Which is the bug iris works around
> here, where using iris caused gpu hangs in libva. Iow, the kernel
> should always and unconditionally return true here. Check out iris
> history for details please, actual iris usage has nothing to do with
> any other cross-context or cross-engine isolation guarantee we're
> making, it's purely about whether our hw ctx code is buggy or not and
> leaks state between clients, because we accidentally used the
> currently running ctx as template instead of a fixed one created once
> at driver load.
> 
> Matt, since the CCS enabling on the igt validation side looks very
> incomplete I'm leaning very much towards "pls revert, try again".
> 
> Cheers, Daniel
> 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_engine_user.c | 13 ++++++++++++-
> >  drivers/gpu/drm/i915/gt/intel_engine_user.h |  1 +
> >  drivers/gpu/drm/i915/i915_drm_client.h      |  2 +-
> >  drivers/gpu/drm/i915/i915_getparam.c        |  2 +-
> >  include/uapi/drm/i915_drm.h                 | 14 +++-----------
> >  5 files changed, 18 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> > index 0f6cd96b459f..2d6bd36d6150 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> > @@ -47,7 +47,7 @@ static const u8 uabi_classes[] = {
> >         [COPY_ENGINE_CLASS] = I915_ENGINE_CLASS_COPY,
> >         [VIDEO_DECODE_CLASS] = I915_ENGINE_CLASS_VIDEO,
> >         [VIDEO_ENHANCEMENT_CLASS] = I915_ENGINE_CLASS_VIDEO_ENHANCE,
> > -       /* TODO: Add COMPUTE_CLASS mapping once ABI is available */
> > +       [COMPUTE_CLASS] = I915_ENGINE_CLASS_COMPUTE,
> >  };
> >
> >  static int engine_cmp(void *priv, const struct list_head *A,
> > @@ -306,3 +306,14 @@ unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915)
> >
> >         return which;
> >  }
> > +
> > +bool intel_cross_engine_isolated(struct drm_i915_private *i915)
> > +{
> > +       unsigned int which = intel_engines_has_context_isolation(i915);
> > +
> > +       if ((which & BIT(I915_ENGINE_CLASS_RENDER)) &&
> > +           (which & BIT(I915_ENGINE_CLASS_COMPUTE)))
> > +               return false;
> > +
> > +       return !!which;
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.h b/drivers/gpu/drm/i915/gt/intel_engine_user.h
> > index 3dc7e8ab9fbc..ff21349db4d4 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_user.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.h
> > @@ -15,6 +15,7 @@ struct intel_engine_cs *
> >  intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
> >
> >  unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915);
> > +bool intel_cross_engine_isolated(struct drm_i915_private *i915);
> >
> >  void intel_engine_add_user(struct intel_engine_cs *engine);
> >  void intel_engines_driver_register(struct drm_i915_private *i915);
> > diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
> > index 5f5b02b01ba0..f796c5e8e060 100644
> > --- a/drivers/gpu/drm/i915/i915_drm_client.h
> > +++ b/drivers/gpu/drm/i915/i915_drm_client.h
> > @@ -13,7 +13,7 @@
> >
> >  #include "gt/intel_engine_types.h"
> >
> > -#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_VIDEO_ENHANCE
> > +#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_COMPUTE
> >
> >  struct drm_i915_private;
> >
> > diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
> > index c12a0adefda5..3d5120d2d78a 100644
> > --- a/drivers/gpu/drm/i915/i915_getparam.c
> > +++ b/drivers/gpu/drm/i915/i915_getparam.c
> > @@ -145,7 +145,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
> >                 value = 1;
> >                 break;
> >         case I915_PARAM_HAS_CONTEXT_ISOLATION:
> > -               value = intel_engines_has_context_isolation(i915);
> > +               value = intel_cross_engine_isolated(i915);
> >                 break;
> >         case I915_PARAM_SLICE_MASK:
> >                 value = sseu->slice_mask;
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 35ca528803fd..84c0af77cc1f 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -166,6 +166,7 @@ enum drm_i915_gem_engine_class {
> >         I915_ENGINE_CLASS_COPY          = 1,
> >         I915_ENGINE_CLASS_VIDEO         = 2,
> >         I915_ENGINE_CLASS_VIDEO_ENHANCE = 3,
> > +       I915_ENGINE_CLASS_COMPUTE       = 4,
> >
> >         /* should be kept compact */
> >
> > @@ -635,17 +636,8 @@ typedef struct drm_i915_irq_wait {
> >  #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
> >
> >  /*
> > - * Query whether every context (both per-file default and user created) is
> > - * isolated (insofar as HW supports). If this parameter is not true, then
> > - * freshly created contexts may inherit values from an existing context,
> > - * rather than default HW values. If true, it also ensures (insofar as HW
> > - * supports) that all state set by this context will not leak to any other
> > - * context.
> > - *
> > - * As not every engine across every gen support contexts, the returned
> > - * value reports the support of context isolation for individual engines by
> > - * returning a bitmask of each engine class set to true if that class supports
> > - * isolation.
> > + * Query whether the device can make cross-engine isolation guarantees for
> > + * all the engines whose default state has been initialised.
> >   */
> >  #define I915_PARAM_HAS_CONTEXT_ISOLATION 50
> >
> > --
> > 2.35.1
> >
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux