Re: [PATCH] drm/i915: Enable scanline read for gen9 dsi

Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> · Mon, 11 Sep 2017 20:50:19 +0300

On Mon, Sep 11, 2017 at 01:04:18PM +0000, Shankar, Uma wrote:
> 
> 
> >-----Original Message-----
> >From: Ville Syrjälä [mailto:ville.syrjala@xxxxxxxxxxxxxxx]
> >Sent: Friday, September 8, 2017 8:18 PM
> >To: Srinivas, Vidya <vidya.srinivas@xxxxxxxxx>
> >Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Kahola, Mika <mika.kahola@xxxxxxxxx>;
> >Kamath, Sunil <sunil.kamath@xxxxxxxxx>; Shankar, Uma
> ><uma.shankar@xxxxxxxxx>; Konduru, Chandra <chandra.konduru@xxxxxxxxx>
> >Subject: Re: [PATCH] drm/i915: Enable scanline read for gen9 dsi
> >
> >On Fri, Sep 08, 2017 at 07:18:55PM +0530, Vidya Srinivas wrote:
> >> From: Uma Shankar <uma.shankar@xxxxxxxxx>
> >>
> >> For gen9 platforms, dsi timings are driven from port instead of pipe
> >> (unlike ddi). Thus, we can't rely on pipe registers to get the timing
> >> information. Even scanline register read will not be functional.
> >> This is causing vblank evasion logic to fail since it relies on
> >> scanline, causing atomic update failure warnings.
> >>
> >> This patch uses pipe framestamp and current timestamp registers to
> >> calculate scanline. This is an indirect way to get the scanline.
> >> It helps resolve atomic update failure for gen9 dsi platforms.
> >>
> >> Signed-off-by: Uma Shankar <uma.shankar@xxxxxxxxx>
> >> Signed-off-by: Chandra Konduru <chandra.konduru@xxxxxxxxx>
> >> Signed-off-by: Vidya Srinivas <vidya.srinivas@xxxxxxxxx>
> >> ---
> >>  drivers/gpu/drm/i915/i915_drv.h  |  2 ++
> >> drivers/gpu/drm/i915/i915_irq.c  |  5 +++++
> >> drivers/gpu/drm/i915/i915_reg.h  |  3 +++
> >> drivers/gpu/drm/i915/intel_dsi.c | 46
> >> ++++++++++++++++++++++++++++++++++++++++
> >>  4 files changed, 56 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> >> b/drivers/gpu/drm/i915/i915_drv.h index d07d110..4213b54 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -4077,6 +4077,8 @@ void intel_sbi_write(struct drm_i915_private
> >> *dev_priv, u16 reg, u32 value,
> >>  u32 vlv_flisdsi_read(struct drm_i915_private *dev_priv, u32 reg);
> >> void vlv_flisdsi_write(struct drm_i915_private *dev_priv, u32 reg, u32
> >> val);
> >>
> >> +u32 bxt_dsi_get_scanline(struct intel_crtc *crtc);
> >> +
> >>  /* intel_dpio_phy.c */
> >>  void bxt_port_to_phy_channel(struct drm_i915_private *dev_priv, enum port
> >port,
> >>  			     enum dpio_phy *phy, enum dpio_channel *ch); diff --
> >git
> >> a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >> index 5d391e6..31aa7f0 100644
> >> --- a/drivers/gpu/drm/i915/i915_irq.c
> >> +++ b/drivers/gpu/drm/i915/i915_irq.c
> >> @@ -781,6 +781,7 @@ static int __intel_get_crtc_scanline(struct intel_crtc
> >*crtc)
> >>  	struct drm_vblank_crtc *vblank;
> >>  	enum pipe pipe = crtc->pipe;
> >>  	int position, vtotal;
> >> +	enum transcoder cpu_transcoder;
> >>
> >>  	if (!crtc->active)
> >>  		return -1;
> >> @@ -792,6 +793,10 @@ static int __intel_get_crtc_scanline(struct intel_crtc
> >*crtc)
> >>  	if (mode->flags & DRM_MODE_FLAG_INTERLACE)
> >>  		vtotal /= 2;
> >>
> >> +	cpu_transcoder = crtc->config->cpu_transcoder;
> >
> >Humm. Would be nice to be able to do this without adding more
> >crtc->config uses. We're pretty much trying to get rid of that guy.
> >
> 
> Will try to find an alternate way to do this.
> 
> >> +	if (IS_BROXTON(dev_priv) && transcoder_is_dsi(cpu_transcoder))
> >> +		return bxt_dsi_get_scanline(crtc);
> >> +
> >>  	if (IS_GEN2(dev_priv))
> >>  		position = I915_READ_FW(PIPEDSL(pipe)) &
> >DSL_LINEMASK_GEN2;
> >>  	else
> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h
> >> b/drivers/gpu/drm/i915/i915_reg.h index 9a73ea0..54582de 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> @@ -8802,6 +8802,9 @@ enum skl_power_gate {
> >>  #define MIPIO_TXESC_CLK_DIV2			_MMIO(0x160008)
> >>  #define  GLK_TX_ESC_CLK_DIV2_MASK			0x3FF
> >>
> >> +#define BXT_TIMESTAMP_CTR	_MMIO(0x44070)
> >> +#define BXT_PIPE_FRMTMSTMP_A	_MMIO(0x70048)
> >
> >Please add proper parametrized define that works for all pipes.
> >
> 
> Will add that.
> 
> >> +
> >>  /* BXT MIPI clock controls */
> >>  #define BXT_MAX_VAR_OUTPUT_KHZ			39500
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_dsi.c
> >> b/drivers/gpu/drm/i915/intel_dsi.c
> >> index 2a0f5d3..d145ba4 100644
> >> --- a/drivers/gpu/drm/i915/intel_dsi.c
> >> +++ b/drivers/gpu/drm/i915/intel_dsi.c
> >> @@ -1621,6 +1621,52 @@ static int intel_dsi_get_modes(struct
> >drm_connector *connector)
> >>  	return 1;
> >>  }
> >>
> >> +/*
> >> + * For Gen9 DSI, pipe scanline register will not
> >> + * work to get the scanline since the timings
> >> + * are driven from the PORT (unlike DDI encoders).
> >> + * This function will use Framestamp and current
> >> + * timestamp registers to calculate the scanline.
> >> + */
> >> +u32 bxt_dsi_get_scanline(struct intel_crtc *crtc) {
> >> +	struct drm_device *dev = crtc->base.dev;
> >> +	struct drm_i915_private *dev_priv = to_i915(dev);
> >> +	u32 vrefresh = crtc->base.mode.vrefresh;
> >> +	u32 ulPrevTime, ulCurrTime, vtotal, ulScanlineNo2 = 0;
> >
> >Please get rid of the hungarian notation.
> >
> 
> Yes, will fix this.
> 
> >> +	uint_fixed_16_16_t ulScanlineTime;
> >> +
> >> +	/*
> >> +	 * This field provides read back of the display
> >> +	 * pipe frame time stamp. The time stamp value
> >> +	 * is sampled at every start of vertical blank.
> >> +	 */
> >> +	ulPrevTime = I915_READ_FW(BXT_PIPE_FRMTMSTMP_A);
> >> +
> >> +	/*
> >> +	 * The TIMESTAMP_CTR register has the current
> >> +	 * time stamp value.
> >> +	 */
> >> +	ulCurrTime = I915_READ_FW(BXT_TIMESTAMP_CTR);
> >> +
> >> +	/* The PORT for DSI will always be 0 since
> >> +	 * isolated PORTC cannot be enabled for Gen9
> >> +	 * DSI. Hence using PORT_A i.e 0 to extract
> >> +	 * the VTOTAL value.
> >> +	 */
> >> +	vtotal = I915_READ_FW(BXT_MIPI_TRANS_VTOTAL(0));
> >
> >This value can be dug out from the hwmode.
> >
> 
> Yes, will get it from hwmode and drop this change.
> 
> >> +	WARN_ON(!vtotal);
> >> +	if (!vtotal)
> >> +		return ulScanlineNo2;
> >> +
> >> +	ulScanlineTime = div_fixed16(1000000, vtotal * vrefresh);
> >> +	ulScanlineNo2 = div_round_up_u32_fixed16((ulCurrTime - ulPrevTime),
> >> +						ulScanlineTime);
> >
> >Something like:
> >scanline = div_u64(mul_u32_u32(curr - prev, crtc_clock),
> >		   1000 * crtc_htotal);
> >
> >> +	ulScanlineNo2 = (ulScanlineNo2 + vtotal) % vtotal;
> >
> >I think that would have to be something like:
> >return (scanline + vblank_start) % vtotal;
> >
> 
> Yes you are right. It should be vblank_start. Will fix this.
> 
> >All in all this looks like a pretty decent approach to the DSI problem.
> >
> >One concern here is rounding issues and inaccuracies in our crtc_clock. But since
> >the frame timestamp is sampled at vblank start I guess we can't accidentally get
> >an answer that's earlier than vblank_start as long as we really passed vblank start
> >already. That should make this at least suitable for vblank timestamps. 
> 
> I also feel the same, this situation should never occur.
> 
> >And for
> >the atomic evade, I guess if we clamp our the scanline before the
> >+vblank_start such that it never reaches vtotal, we can't be sure that
> >our vblank evade never indicates that we already reached the start of vblank
> >prematurely.
> >
> >So maybe something like:
> >scaline = div_u64(...);
> >scanline = min(scanline, vtotal - 1);
> 
> I am not sure if the value of scanline returned can ever be greater than vtotal -1.
> But we can have a check just to be safe. Not sure if I fully got your point here.

The point is that the timestamp counter might tick at a slightly faster
rate than we might think. Thus we might end up with more ticks in one
frame than what we calculated as the maximum fom crtc_clock etc. But if
we clamp the value like I suggested then at least we should never get
an answer that tells us we're already past the start of vblank when in
reality we're not.

Of course as Daniel pointed out we might also get into trouble if the
counter ticks slower than expected. That could lead us to think that
we don't need to do the vblank evade when in fact we do.

Oh and there's maybe another race lurking here. We might cross into the
next vblank just between the PIPE_FRMTMSTMP and TIMESTAMP_CTR reads. If
that happens we get an answer that's definitely too big for one frame.
I guess we could avoid that particular problem by making sure we really
read PIPE_FRMTMSTMP and TIMESTAMP_CTR during the same frame. Eg.
something like:

do {
	prev = PIPE_FRMTMSTMP;
	curr = TIMESTAMP_CTR
	post = PIPE_FRMTMSTMP
} while (prev != post);

> 
> >return (scanline + vblank_start) % vtotal;
> >
> >At least that's my thinking atm. Feel free to rip my reasoning to shreds if you think
> >I'm totally wrong here.
> >
> 
> One more thing we missed is, that the current timestamp is just a 32 bit register value.
> It can overflow and wrap around. So a situation can come, where current timestamp will
> be less than prev timestamp (read from frame time stamp reg). We need to handle that
> situation as well.  Will fix that in the next version and resend.

Modulo 2^32 math will handle that just fine.

> 
> Thanks Ville for your valuable review comments.
> 
> Regards,
> Uma Shankar
> 
> >
> >> +
> >> +	return ulScanlineNo2;
> >> +}
> >> +
> >>  static void intel_dsi_connector_destroy(struct drm_connector
> >> *connector)  {
> >>  	struct intel_connector *intel_connector =
> >> to_intel_connector(connector);
> >> --
> >> 1.9.1
> >
> >--
> >Ville Syrjälä
> >Intel OTC

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx