Re: [PATCH 3/4] drm/i915: Fix random aux transactions failures.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 10/21/2015 12:53 PM, Daniel Vetter wrote:
On Wed, Oct 21, 2015 at 09:18:06AM +0200, Daniel Vetter wrote:
On Wed, Oct 21, 2015 at 10:28:53AM -0700, Rodrigo Vivi wrote:
Mainly aux communications on sink_crc
were failing a lot randomly on recent platforms.
The first solution was to try to use intel_dp_dpcd_read_wake, but then
it was suggested to move retries to drm level.

Since drm level was already taking care of retries and didn't want
to through random retries on that level the second solution was to
put the retries at aux_transfer layer what was nacked.

So I realized we had so many retries in different places and
started to organize that a bit. During this organization I noticed
that we weren't handing at all the case were the message size was
zeroed. And this was exactly the case that was affecting sink_crc.

Also we weren't respect BSPec who says this size message = 0 or > 20
are forbidden.

It is a fact that we still have no clue why we are getting this
forbidden value there. But anyway we need to handle that for now
so we return -EBUSY and drm level takes care of the retries that
are already in place.

Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
---
  drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
  1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index aa3d8f6..80850d6 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -911,6 +911,17 @@ done:
  	/* Unload any bytes sent back from the other side */
  	recv_bytes = ((status & DP_AUX_CH_CTL_MESSAGE_SIZE_MASK) >>
  		      DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT);
+
+	/*
+	 * By BSpec: "Message sizes of 0 or >20 are not allowed."
+	 * We have no idea of what happened so we return -EBUSY so
+	 * drm layer takes care for the necessary retries.
+	 */
+	if (recv_bytes == 0 || recv_bytes > 20) {
+		ret = -EBUSY;
+		goto out;
+	}
Hm, this should be caught be the dp aux helper library. Both callers for
->transfer should check for this and reject with -EINVAL (since such a
transaction is simply not allowed by dp aux). In the case of
drm_dp_i2c_do_msg maybe even with a WARN_ON since the i2c logic should
split things up correctly.
Meh, totally misread what's going on here, this is from the hardware. How
does this even happen? Do you get some kind of garbage value? Should we
maybe clear this register field first? It certainly would explain a lot of
our random dp aux retry fun ...
-Daniel
we are already checking for read size in intel_dp_aux_transfer

 if (WARN_ON(rxsize > 20))
               return -E2BIG;

and again inside intel_dp_aux_ch

 843         /* Only 5 data registers! */
 844         if (WARN_ON(send_bytes > 20 || recv_size > 20)) {
 845                 ret = -E2BIG;
 846                 goto out;
 847         }

Also isn't it possible that a simple write command will have 0 for receive size ?
can you share bit more details on what scenario this patch is helping ?

regards,
Sivakumar

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux