Re: [intel-gfx] [PATCH] drm/i915: avoid concurrent writes to aux_inv (rev3)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Chris, for some reason I didn’t receive the review email, so I copied your comments from patchwork and faked this email.

 

>>  static void execlists_dequeue(struct intel_engine_cs *engine)

>>  {

>>         struct intel_engine_execlists * const execlists = &engine->execlists;

>> @@ -1538,6 +1566,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine)

>>                         }

>> 

>>                         if (__i915_request_submit(rq)) {

>> +                               /* hsdes: 1809175790 */

>> +                               if ((GRAPHICS_VER(engine->i915) == 12) &&

>> +                                   rq->vd_ve_aux_inv &&

>> +                                   (engine->class == VIDEO_DECODE_CLASS ||

>> +                                    engine->class == VIDEO_ENHANCEMENT_CLASS)) {

 

> We do not need the extra checks here; we just do as we are told. We only

> tell ourselves to apply the fixup when required.

 

Without checking GRAPHICS_VER, I’m seeing a lot of regressions on older platforms in the CI result.

This workaround was only implemented for gen12 (gen12_emit_flush_xcs).

Without checking engine->class, I’m seeing boot issues due to GEM_BUG_ON() in aux_inv_reg().

Any rq will go through this code regardless of engine class and gen version, so the checking seems to be

necessary.

 

>> +                                       *rq->vd_ve_aux_inv = i915_mmio_reg_offset

 

> Likewise, vd_ve is overspecific, aux_inv_fixup or aux_inv_wa (or

> wa_aux_iv, fixup_aux_inv).

 

I wanted to be specific because the workaround was only implemented for vd/ve engines.

But I’m ok with your proposal.

 

>> +                                               (aux_inv_reg(engine));

>> +                                       rq->vd_ve_aux_inv = NULL;

 

> Move this to i915_request initialisation so that we only set aux_inv

> when required, which probably explains the extra defence.

 

The pointer is currently initialized with 0x5a5a. I set it to NULL in gen12_emit_flush_xcs, otherwise the rq will

enter that if-statement with an invalid pointer.

I’m not familiar with the code, there seems to be multiple functions allocating the structure. I agree it’s better

to set it to NULL at initialization, but need some guidance on where is the best place to do so.

 

>> +                                       rq->execution_mask = engine->mask;

>> +                               }

>>                                 if (!merge) {

>>                                         *port++ = i915_request_get(last);

>>                                         last = NULL;

>> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h

>> index 28b1f9db5487..69de32e5e15d 100644

>> --- a/drivers/gpu/drm/i915/i915_request.h

>> +++ b/drivers/gpu/drm/i915/i915_request.h

>> @@ -350,6 +350,8 @@ struct i915_request {

>>                 struct list_head link;

>>                 unsigned long delay;

>>         } mock;)

>> +

>> +       u32 *vd_ve_aux_inv;

 

> Not at the end of the struct; that's where we put things in the dungeon.

> The selftest struct should be last; I do hope no one has been putting

> things at random places in the struct without considering the layout and

> semantics. From the flow, this is akin to batch, capture_list; before

> emitted_jiffies would be a good spot.

 

Got it, will change. I thought adding at the end would be safer, thanks for the explanation.

 

> -Chris

 


[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux