Re: [PATCH v2 4/4] drm/i915: Implement Link Rate fallback on Link training failure

On Tue, Nov 01, 2016 at 09:16:28PM +0200, Jani Nikula wrote:
> On Tue, 01 Nov 2016, Manasi Navare <manasi.d.navare@xxxxxxxxx> wrote:
> > On Tue, Nov 01, 2016 at 10:49:14AM +0200, Jani Nikula wrote:
> >> On Sat, 29 Oct 2016, Manasi Navare <manasi.d.navare@xxxxxxxxx> wrote:
> >> > If link training at a link rate optimal for a particular
> >> > mode fails during modeset's atomic commit phase, then we
> >> > let the modeset complete and then retry. We save the link rate
> >> > value at which link training failed, update the link status property
> >> > to "BAD" and use a lower link rate to prune the modes. It will redo
> >> > the modeset on the current mode at lower link rate or if the current
> >> > mode gets pruned due to lower link constraints then, it will send a
> >> > hotplug uevent for userspace to handle it.
> >> >
> >> > This is also required to pass DP CTS tests 4.3.1.3, 4.3.1.4,
> >> > 4.3.1.6.
> >> >
> >> > v2:
> >> > * Squashed a few patches (Jani Nikula)
> >> >
> >> > Cc: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> >> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> >> > Cc: Ville Syrjala <ville.syrjala@xxxxxxxxxxxxxxx>
> >> > Signed-off-by: Manasi Navare <manasi.d.navare@xxxxxxxxx>
> >> > ---
> >> >  drivers/gpu/drm/drm_atomic_helper.c           |  4 ++
> >> >  drivers/gpu/drm/i915/intel_ddi.c              | 23 ++++++++-
> >> >  drivers/gpu/drm/i915/intel_dp.c               | 74 +++++++++++++++++++++++++--
> >> >  drivers/gpu/drm/i915/intel_dp_link_training.c | 12 +++--
> >> >  drivers/gpu/drm/i915/intel_drv.h              |  5 +-
> >> >  5 files changed, 110 insertions(+), 8 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> >> > index 75ad01d..a3df3a4 100644
> >> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> >> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> >> > @@ -519,6 +519,10 @@ static int handle_conflicting_encoders(struct drm_atomic_state *state,
> >> >  					       connector_state);
> >> >  		if (ret)
> >> >  			return ret;
> >> > +
> >> > +		crtc_state = drm_atomic_get_existing_crtc_state(state, connector->state->crtc);
> >> > +		if (connector->link_status == DRM_MODE_LINK_STATUS_BAD)
> >> > +			crtc_state->connectors_changed = true;
> >> >  	}
> >> >  
> >> >  	/*
> >> > diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> >> > index 938ac4d..319eeca 100644
> >> > --- a/drivers/gpu/drm/i915/intel_ddi.c
> >> > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> >> > @@ -1684,6 +1684,8 @@ static void intel_ddi_pre_enable_dp(struct intel_encoder *encoder,
> >> >  	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
> >> >  	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
> >> >  	enum port port = intel_ddi_get_encoder_port(encoder);
> >> > +	struct intel_connector *intel_connector = intel_dp->attached_connector;
> >> > +	struct drm_connector *connector = &intel_connector->base;
> >> >  
> >> >  	intel_dp_set_link_params(intel_dp, link_rate, lane_count,
> >> >  				 link_mst);
> >> > @@ -1694,7 +1696,26 @@ static void intel_ddi_pre_enable_dp(struct intel_encoder *encoder,
> >> >  	intel_prepare_dp_ddi_buffers(encoder);
> >> >  	intel_ddi_init_dp_buf_reg(encoder);
> >> >  	intel_dp_sink_dpms(intel_dp, DRM_MODE_DPMS_ON);
> >> > -	intel_dp_start_link_train(intel_dp);
> >> > +	if (!intel_dp_start_link_train(intel_dp)) {
> >> > +		DRM_DEBUG_KMS("Link Training failed at link rate = %d, lane count = %d",
> >> > +			      link_rate, lane_count);
> >> > +		intel_dp->link_train_failed = true;
> >> > +		intel_dp_get_link_train_fallback_values(intel_dp, link_rate,
> >> > +							lane_count);
> >> > +		/* Schedule a Hotplug Uevent to userspace to start modeset */
> >> > +		schedule_work(&intel_connector->modeset_retry_work);
> >> 
> >> This is not just about DDI. Need to do this for the other cases too.
> >>
> >
> > Yes, the first series will go out adding this support for DDI, followed by more
> > patches to extend it to non-DDI platforms.
> >
> >  
> >> > +	} else {
> >> > +		DRM_DEBUG_KMS("Link Training Passed at Link Rate = %d, Lane count = %d",
> >> > +			      link_rate, lane_count);
> >> > +		intel_dp->link_train_failed = false;
> >> > +		intel_dp->fallback_link_rate_index = -1;
> >> > +		intel_dp->fallback_link_rate = 0;
> >> > +		intel_dp->fallback_lane_count = 0;
> >> > +		connector->link_status = DRM_MODE_LINK_STATUS_GOOD;
> >> > +		intel_dp_set_link_status_property(connector,
> >> > +						  DRM_MODE_LINK_STATUS_GOOD);
> >> 
> >> Looks like you never actually read connector->link_status... Why do you
> >> need both connector->link_status and intel_dp->link_train_failed? Do you
> >> think you have 4 states? What are they? Can't this all be in sync with
> >> the property?
> >> 
> >
> > The connector->link_status member of drm_connector gets read in
> > drm_atomic_helper_check_modeset(), which reads it and sets
> > crtc_state->connectors_changed to true if this link_status has changed.
> > This is required so that the driver does a complete modeset.
> > connector->link_status was in sync with the property, but reading the
> > drm_object property in drm_atomic_helper_check_modeset() was causing the
> > system to not boot.
> > intel_dp->link_train_failed also just indicates whether the link failed. I will
> > have to see if I can just use connector->link_status for this purpose.
> 
> Please do. Usually if you add more than one variable for essentially the
> same thing, you'll end up having combinations of the variables (states)
> that you should not be in. At least you should have a clear idea what
> the states are where link_train_failed and link_status disagree.
>

I will try to converge these.
How do we handle the case where the driver has tried all possible fallback values,
down to the lowest link rate and lowest lane count, and the link is still bad?
The DPR expects the driver to stop link training with an ERROR saying that link
training was unsuccessful even after trying the fallbacks. In this case we should
not try to retrain again, so I was just setting link_train_failed back to false.
Do you have any suggestions on how to handle this case? Do we send the uevent
in this case as well? We would not have fallback values at that point, since
we have exhausted all of them.
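
To make the exhausted case concrete, here is a rough sketch of how the
fallback step could report that nothing is left to try. This is not code
from this series: struct link_params, next_fallback() and the policy of
lowering the rate before the lane count are all assumptions for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the link parameters to try next. */
struct link_params {
	int rate_index;	/* index into a sorted common-rates table */
	int lane_count;	/* 1, 2 or 4 */
};

/*
 * Compute the next fallback step. Assumed policy (not taken from the
 * patch): lower the link rate first; once at the lowest rate, halve the
 * lane count and restart from the highest rate index. Returns false when
 * the lowest rate with a single lane has already been tried.
 */
static bool next_fallback(struct link_params *p, int max_rate_index)
{
	assert(p->lane_count >= 1);

	if (p->rate_index > 0) {
		p->rate_index--;
		return true;
	}
	if (p->lane_count > 1) {
		p->lane_count >>= 1;
		p->rate_index = max_rate_index;
		return true;
	}
	return false;	/* exhausted: nothing lower left to try */
}
```

With something like this, a false return would be the point where the
driver stops scheduling the retry work. Whether a uevent should still be
sent in that case is the open question above.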

 
> >
> >> > +	}
> >> > +
> >> >  	if (port != PORT_A || INTEL_GEN(dev_priv) >= 9)
> >> >  		intel_dp_stop_link_train(intel_dp);
> >> >  }
> >> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> >> > index fb4fcdd..d1f0e2c 100644
> >> > --- a/drivers/gpu/drm/i915/intel_dp.c
> >> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> >> > @@ -354,8 +354,14 @@ void intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
> >> >  		target_clock = fixed_mode->clock;
> >> >  	}
> >> >  
> >> > -	max_link_clock = intel_dp_max_link_rate(intel_dp);
> >> > -	max_lanes = intel_dp_max_lane_count(intel_dp);
> >> > +	/* Prune the modes using the fallback link rate/lane count */
> >> > +	if (intel_dp->link_train_failed) {
> >> > +		max_link_clock = intel_dp->fallback_link_rate;
> >> > +		max_lanes = intel_dp->fallback_lane_count;
> >> > +	} else {
> >> > +		max_link_clock = intel_dp_max_link_rate(intel_dp);
> >> > +		max_lanes = intel_dp_max_lane_count(intel_dp);
> >> > +	}
> >> >  
> >> >  	max_rate = intel_dp_max_data_rate(max_link_clock, max_lanes);
> >> >  	mode_rate = intel_dp_link_required(target_clock, 18);
> >> > @@ -1640,6 +1646,12 @@ static int intel_dp_compute_bpp(struct intel_dp *intel_dp,
> >> >  	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLCLK)
> >> >  		return false;
> >> >  
> >> > +	/* Fall back to lower link rate in case of failure in previous modeset */
> >> > +	if (intel_dp->link_train_failed) {
> >> > +		min_lane_count = max_lane_count = intel_dp->fallback_lane_count;
> >> > +		min_clock = max_clock = intel_dp->fallback_link_rate_index;
> >> > +	}
> >> > +
> >> 
> >> My general feeling is that there's starting to be a bit too much special
> >> casing around the fallback values. I'm not decided we need to fix this
> >> right away in this series, or whether it can be follow-up work.
> >> 
> >> One idea is to compute the common rates and lanes once when they're
> >> first needed, and all of the helpers would use that info. The fallback
> >> code would just trim those, and the conditional fallback stuff could be
> >> removed from all over the place.
> >>
> >
> > These fallback values get computed in a separate helper function that
> > I have added intel_dp_get_link_train_fallback_values. It is the
> > previous patch.  I store the fallback values in intel_dp one at a time
> > because for that iteration of modeset we only need to try the
> > fallback_link_rate and fallback_lane_count so we dont need an array
> > here.
> 
> Sure. I'm *not* referring to my earlier suggestion of storing failing
> link rate, lane count pairs (although I think we'll need that
> eventually).
> 
> > Are you suggesting just changing the common_rates array itself to get
> > trimmed to use the trimmed fallback values after link training fails?
> > Could you please elaborate your thought?
> 
> I'm not saying you should change this now. But having a lot of code
> check some fallback stuff is ugly and error prone. I'm just documenting
> the ideas to make this better.
> 
> I think this could be fixed by storing the common rates array and max
> lanes in intel_dp, instead of having them locally in a few functions,
> and then making the link rate and lane counting functions aware of the
> fallback stuff. Either the functions would check for the fallbacks,
> centralizing the checks in one place, and/or the fallback code would
> modify the common values stored in intel_dp, so that the link rate and
> lane counting functions would just work with the fallbacks. Code would
> be simpler overall.
> 

Again, this might involve a lot more changes since we would have to change
all the helper functions that currently calculate max lanes and common_rates,
but this can definitely be done in the next round of cleanup after these
patches land. Nevertheless, I will see if I can update the common_rates
array now with fallback values instead of storing fallback values separately
in the intel_dp structure.
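
As a rough illustration of that idea, the sketch below trims a cached
rates table after a failure so that the helpers need no fallback checks.
The struct dp_link_caps layout and both function names are hypothetical
and not part of this series. It assumes the table is sorted ascending and
holds rates in kHz.

```c
#include <assert.h>

/* Hypothetical cached link capabilities; field names are illustrative. */
struct dp_link_caps {
	int common_rates[8];	/* sorted ascending, in kHz */
	int num_common_rates;
	int max_lanes;
};

/*
 * Drop every rate >= failed_rate from the cached table.
 * Returns the number of rates left; 0 means nothing lower remains.
 */
static int trim_common_rates(struct dp_link_caps *caps, int failed_rate)
{
	int n = 0;

	assert(caps->num_common_rates <= 8);
	while (n < caps->num_common_rates &&
	       caps->common_rates[n] < failed_rate)
		n++;
	caps->num_common_rates = n;
	return n;
}

/* Helpers read only the cached table, so they need no fallback checks. */
static int max_link_rate(const struct dp_link_caps *caps)
{
	return caps->num_common_rates ?
		caps->common_rates[caps->num_common_rates - 1] : 0;
}
```

After trim_common_rates(&caps, 540000) on a table of 162000, 270000 and
540000, max_link_rate() would report 270000 without any
link_train_failed check in the caller.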

Manasi
> >  
> >> >  	DRM_DEBUG_KMS("DP link computation with max lane count %i "
> >> >  		      "max bw %d pixel clock %iKHz\n",
> >> >  		      max_lane_count, common_rates[max_clock],
> >> > @@ -4423,6 +4435,13 @@ static bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
> >> >  		intel_dp->compliance_test_active = 0;
> >> >  		intel_dp->compliance_test_type = 0;
> >> >  		intel_dp->compliance_test_data = 0;
> >> > +		intel_dp->link_train_failed = false;
> >> > +		intel_dp->fallback_link_rate_index = -1;
> >> > +		intel_dp->fallback_link_rate = 0;
> >> > +		intel_dp->fallback_lane_count = 0;
> >> > +		connector->link_status = DRM_MODE_LINK_STATUS_GOOD;
> >> > +		intel_dp_set_link_status_property(connector,
> >> > +						  DRM_MODE_LINK_STATUS_GOOD);
> >> >  
> >> >  		if (intel_dp->is_mst) {
> >> >  			DRM_DEBUG_KMS("MST device may have disappeared %d vs %d\n",
> >> > @@ -4514,8 +4533,12 @@ static bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
> >> >  	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
> >> >  		      connector->base.id, connector->name);
> >> >  
> >> > +	/* If this is a retry due to link training failure */
> >> > +	if (status == connector_status_connected && intel_dp->link_train_failed)
> >> > +		return status;
> >> > +
> >> >  	/* If full detect is not performed yet, do a full detect */
> >> > -	if (!intel_dp->detect_done)
> >> > +	if (!intel_dp->detect_done && !intel_dp->link_train_failed)
> >> >  		status = intel_dp_long_pulse(intel_dp->attached_connector);
> >> >  
> >> >  	intel_dp->detect_done = false;
> >> > @@ -5692,6 +5715,47 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
> >> >  	return false;
> >> >  }
> >> >  
> >> > +static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >> > +{
> >> > +	struct intel_connector *intel_connector;
> >> > +	struct drm_connector *connector;
> >> > +	struct drm_display_mode *mode;
> >> > +	bool verbose_prune = true;
> >> > +	bool reprobe = false;
> >> > +
> >> > +	intel_connector = container_of(work, typeof(*intel_connector),
> >> > +				       modeset_retry_work);
> >> > +	connector = &intel_connector->base;
> >> > +
> >> > +	/* Grab the locks before changing connector property*/
> >> > +	mutex_lock(&connector->dev->mode_config.mutex);
> >> > +	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
> >> > +		      connector->name);
> >> > +	list_for_each_entry(mode, &connector->modes, head) {
> >> > +		mode->status = intel_dp_mode_valid(connector,
> >> > +						   mode);
> >> > +		if (mode->status != MODE_OK)
> >> > +			reprobe = true;
> >> > +	}
> >> > +	drm_mode_prune_invalid(connector->dev, &connector->modes,
> >> > +			       verbose_prune);
> >> > +
> >> > +	/* Set connector link status to BAD only if modeset required
> >> > +	 * for the current mode, if mode list changed then just send uevent
> >> > +	 * so that it can reprobe the connectors and validate modes and do
> >> > +	 * a modeset on a different valid mode.
> >> > +	 */
> >> > +	if (!reprobe) {
> >> > +		connector->link_status = DRM_MODE_LINK_STATUS_BAD;
> >> > +		intel_dp_set_link_status_property(connector,
> >> > +						  DRM_MODE_LINK_STATUS_BAD);
> >> > +	}
> >> 
> >> 
> >> I think the link status property should be set to bad unconditionally
> >> here. If the current link is bad, it is bad right now independent of
> >> modes fitting into the fallback link parameters.
> >> 
> >> Which makes me think, unless I'm missing something, that you might be
> >> able to prune the invalid modes and set the property right away when
> >> link training fails, and only use the work to do
> >> drm_kms_helper_hotplug_event.
> >>
> >
> > The problem with setting the link_status property to BAD irrespective of
> > the mode pruning is that, if the modes get pruned and we set
> > link_status to BAD, Chris Wilson's driver checks that the link status
> > is BAD and first attempts the modeset at the current mode
> > without calling mode_valid. That results in a failure in
> > encoder->compute_config since the mode no longer fits and the pipe
> > cannot be configured. This creates a lot of warnings/errors and eventually
> > a kernel crash.  So the best way is to set the link status to BAD
> > only when we want to force the modeset at the current mode; if the
> > modes get pruned then userspace will in any case do another modeset at
> > the next lower mode.
> 
> Why would the userspace driver retry the same mode without refreshing
> the mode list first if link status is bad?
> 
> If the link is bad, the link is *bad*, regardless of whether the current
> mode might eventually work or not. An interface that reports good on
> some values of bad is inconsistent (related reading [1]). IMO the
> userspace should first figure out if the current mode is valid or not,
> and then decide what to do.
> 
> Chris, any comments?
> 
> Side note, the kernel must not crash depending on what the userspace
> does.
> 
> BR,
> Jani.
> 

Chris, yes, I think it is a good idea to always call mode_valid first on receiving
link_status as BAD and then call setcrtc to set the current mode or whichever mode
is valid. Then I would not need to set link status BAD conditionally in the driver.
Could you make this change in your link_status patch and resend it to me?
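
To spell out the ordering I have in mind on the userspace side, here is a
minimal sketch. The enum and on_link_status_event() are hypothetical
names, not taken from Chris's patch, and the two inputs stand in for
reading the link status property and re-validating the current mode
against the refreshed mode list.

```c
#include <assert.h>
#include <stdbool.h>

enum retry_action {
	RETRY_NONE,		/* link is good, nothing to do */
	RETRY_SAME_MODE,	/* current mode still valid: redo the modeset */
	RETRY_PICK_NEW_MODE,	/* current mode was pruned: choose another */
};

/*
 * Hypothetical userspace policy: link_bad is the link status property
 * read after the uevent; current_mode_valid is the result of re-probing
 * the connector and looking for the current mode in the refreshed list.
 */
static enum retry_action on_link_status_event(bool link_bad,
					      bool current_mode_valid)
{
	if (!link_bad)
		return RETRY_NONE;
	return current_mode_valid ? RETRY_SAME_MODE : RETRY_PICK_NEW_MODE;
}
```

The point is only that mode validity is checked before any setcrtc, so
the kernel can report BAD unconditionally.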

Manasi

> 
> [1] http://sweng.the-davies.net/Home/rustys-api-design-manifesto
> 
> >
> > Manasi 
> >> > +	mutex_unlock(&connector->dev->mode_config.mutex);
> >> > +
> >> > +	/* Send Hotplug uevent so userspace can reprobe */
> >> > +	drm_kms_helper_hotplug_event(connector->dev);
> >> > +}
> >> > +
> >> >  bool
> >> >  intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
> >> >  			struct intel_connector *intel_connector)
> >> > @@ -5704,6 +5768,10 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
> >> >  	enum port port = intel_dig_port->port;
> >> >  	int type;
> >> >  
> >> > +	/* Initialize the work for modeset in case of link train failure */
> >> > +	INIT_WORK(&intel_connector->modeset_retry_work,
> >> > +		  intel_dp_modeset_retry_work_fn);
> >> > +
> >> >  	if (WARN(intel_dig_port->max_lanes < 1,
> >> >  		 "Not enough lanes (%d) for DP on port %c\n",
> >> >  		 intel_dig_port->max_lanes, port_name(port)))
> >> > diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
> >> > index 0048b52..10f81ab 100644
> >> > --- a/drivers/gpu/drm/i915/intel_dp_link_training.c
> >> > +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
> >> > @@ -310,9 +310,15 @@ void intel_dp_stop_link_train(struct intel_dp *intel_dp)
> >> >  				DP_TRAINING_PATTERN_DISABLE);
> >> >  }
> >> >  
> >> > -void
> >> > +bool
> >> >  intel_dp_start_link_train(struct intel_dp *intel_dp)
> >> >  {
> >> > -	intel_dp_link_training_clock_recovery(intel_dp);
> >> > -	intel_dp_link_training_channel_equalization(intel_dp);
> >> > +	bool ret;
> >> > +
> >> > +	if (intel_dp_link_training_clock_recovery(intel_dp)) {
> >> > +		ret = intel_dp_link_training_channel_equalization(intel_dp);
> >> > +		if (ret)
> >> > +			return true;
> >> > +	}
> >> > +	return false;
> >> >  }
> >> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> >> > index bc25b2b..a54e9b7 100644
> >> > --- a/drivers/gpu/drm/i915/intel_drv.h
> >> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> >> > @@ -312,6 +312,9 @@ struct intel_connector {
> >> >  	void *port; /* store this opaque as its illegal to dereference it */
> >> >  
> >> >  	struct intel_dp *mst_port;
> >> > +
> >> > +	/* Work struct to schedule a uevent on link train failure */
> >> > +	struct work_struct modeset_retry_work;
> >> >  };
> >> >  
> >> >  struct dpll {
> >> > @@ -1402,7 +1405,7 @@ void intel_dp_set_link_params(struct intel_dp *intel_dp,
> >> >  			      bool link_mst);
> >> >  void intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
> >> >  					     int link_rate, uint8_t lane_count);
> >> > -void intel_dp_start_link_train(struct intel_dp *intel_dp);
> >> > +bool intel_dp_start_link_train(struct intel_dp *intel_dp);
> >> >  void intel_dp_stop_link_train(struct intel_dp *intel_dp);
> >> >  void intel_dp_sink_dpms(struct intel_dp *intel_dp, int mode);
> >> >  void intel_dp_encoder_reset(struct drm_encoder *encoder);
> >> 
> >> -- 
> >> Jani Nikula, Intel Open Source Technology Center
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



