Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 13 Apr 2016, Daniel Vetter <daniel@xxxxxxxx> wrote:
> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>> Two concurrent writes into the same register cacheline has the chance of
>> killing the machine on Ivybridge and other gen7. This includes LRI
>> emitted from the command parser.  The MI_SET_CONTEXT itself serves as
>> serialising barrier and prevents the pair of register writes in the first
>> packet from triggering the fault.  However, if a second switch-context
>> immediately occurs then we may have two adjacent blocks of LRI to the
>> same registers which may then trigger the hang. To counteract this we
>> need to insert a delay after the second register write using SRM.
>> 
>> This is easiest to reproduce with something like
>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>> switches (with no operations in between them in the command stream,
>> which requires the execbuf operation to be interrupted after the
>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>> interruptible igt. No reports from the wild though, so it must be of low
>> enough frequency that no one has correlated the random machine freezes
>> with i915.ko
>> 
>> The issue was introduced with
>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>> Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> Date:   Tue Dec 16 10:02:27 2014 +0000
>> 
>>     drm/i915: Disable PSMI sleep messages on all rings around context switches
>> 
>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> Cc: Daniel Vetter <daniel@xxxxxxxx>
>> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>> Cc: stable@xxxxxxxxxxxxxxx
>
> Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>

FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

BR,
Jani.

(*) Well, not exactly *this* but rather
https://patchwork.freedesktop.org/patch/80952/ which was not posted on
the list so I can't reply to it.


>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index fe580cb9501a..e5ad7b21e356 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  
>>  	len = 4;
>>  	if (INTEL_INFO(engine->dev)->gen >= 7)
>> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>  
>>  	ret = intel_ring_begin(req, len);
>>  	if (ret)
>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>>  		if (num_rings) {
>>  			struct intel_engine_cs *signaller;
>> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>>  
>>  			intel_ring_emit(engine,
>>  					MI_LOAD_REGISTER_IMM(num_rings));
>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  				if (signaller == engine)
>>  					continue;
>>  
>> -				intel_ring_emit_reg(engine,
>> -						    RING_PSMI_CTL(signaller->mmio_base));
>> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
>> +				intel_ring_emit_reg(engine, last_reg);
>>  				intel_ring_emit(engine,
>>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>  			}
>> +
>> +			/* Insert a delay before the next switch! */
>> +			intel_ring_emit(engine,
>> +					MI_STORE_REGISTER_MEM |
>> +					MI_SRM_LRM_GLOBAL_GTT);
>> +			intel_ring_emit_reg(engine, last_reg);
>> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
>> +			intel_ring_emit(engine, MI_NOOP);
>>  		}
>>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>  	}
>> -- 
>> 2.8.0.rc3
>> 

-- 
Jani Nikula, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]