> -----Original Message----- > From: Kahola, Mika <mika.kahola@xxxxxxxxx> > Sent: Friday, November 3, 2023 4:47 PM > To: Kahola, Mika <mika.kahola@xxxxxxxxx>; Sousa, Gustavo <gustavo.sousa@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Nikula, Jani <jani.nikula@xxxxxxxxx>; Syrjala, Ville <ville.syrjala@xxxxxxxxx> > Subject: RE: [PATCH] drm/i915/mtl: Clear possible sticky bits on PICA message bus > > > -----Original Message----- > > From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of > > Kahola, Mika > > Sent: Thursday, November 2, 2023 4:54 PM > > To: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx>; > > intel-gfx@xxxxxxxxxxxxxxxxxxxxx > > Cc: Nikula, Jani <jani.nikula@xxxxxxxxx>; Syrjala, Ville > > <ville.syrjala@xxxxxxxxx> > > Subject: Re: [PATCH] drm/i915/mtl: Clear possible sticky > > bits on PICA message bus > > > > > -----Original Message----- > > > From: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx> > > > Sent: Thursday, November 2, 2023 4:23 PM > > > To: Kahola, Mika <mika.kahola@xxxxxxxxx>; > > > intel-gfx@xxxxxxxxxxxxxxxxxxxxx > > > Cc: Kahola, Mika <mika.kahola@xxxxxxxxx>; Nikula, Jani > > > <jani.nikula@xxxxxxxxx>; Syrjala, Ville <ville.syrjala@xxxxxxxxx> > > > Subject: Re: [PATCH] drm/i915/mtl: Clear possible sticky bits on > > > PICA message bus > > > > > > Quoting Mika Kahola (2023-11-01 07:31:01-03:00) > > > >It is possible that sticky bits or error bits are left on message > > > >bus status register. Reading and then writing the value back to > > > >messagebus status register clears all possible sticky bits and errors. > > > > > > > >Signed-off-by: Mika Kahola <mika.kahola@xxxxxxxxx> > > > >--- > > > > drivers/gpu/drm/i915/display/intel_cx0_phy.c | 14 ++++++++++++++ > > > > 1 file changed, 14 insertions(+) > > > > > > > >diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy.c > > > >b/drivers/gpu/drm/i915/display/intel_cx0_phy.c > > > >index b2ad4c6172f6..f439f0c7b400 100644 > > > >--- a/drivers/gpu/drm/i915/display/intel_cx0_phy.c > > > >+++ b/drivers/gpu/drm/i915/display/intel_cx0_phy.c > > > >@@ -195,6 +195,13 @@ static int __intel_cx0_read_once(struct drm_i915_private *i915, enum port port, > > > > return -ETIMEDOUT; > > > > } > > > > > > > >+ /* > > > >+ * write XELPDP_PORT_P2M_MSGBUS_STATUS register after read to clear > > > >+ * any error sticky bits set from previous transactions > > > >+ */ > > > >+ val = intel_de_read(i915, XELPDP_PORT_P2M_MSGBUS_STATUS(port, lane)); > > > >+ intel_de_write(i915, XELPDP_PORT_P2M_MSGBUS_STATUS(port, > > > >+ lane), val); > > > >+ > > > > intel_de_write(i915, XELPDP_PORT_M2P_MSGBUS_CTL(port, lane), > > > > XELPDP_PORT_M2P_TRANSACTION_PENDING | > > > > XELPDP_PORT_M2P_COMMAND_READ | @@ -262,6 > > > >+269,13 @@ static int __intel_cx0_write_once(struct > > > >+drm_i915_private *i915, enum port port, > > > > return -ETIMEDOUT; > > > > } > > > > > > > >+ /* > > > >+ * write XELPDP_PORT_P2M_MSGBUS_STATUS register after read to clear > > > >+ * any error sticky bits set from previous transactions > > > >+ */ > > > >+ val = intel_de_read(i915, XELPDP_PORT_P2M_MSGBUS_STATUS(port, lane)); > > > >+ intel_de_write(i915, XELPDP_PORT_P2M_MSGBUS_STATUS(port, > > > >+ lane), val); > > > >+ > > > > > > Looking at the current state of the code, looks like to me that we already clear the bits from both the "success" and "failure" > > paths. > > > For the "success" > > > paths, that is done by a direct call to > > > intel_clear_response_ready_flag(); for the "failure" case, the call > > > to > > > intel_clear_response_ready_flag() is done as part of intel_cx0_bus_reset(). > > > > > > Thus, considering that we start using the msgbus from a clean state, > > > maybe these extra steps are not necessary? Have you tried adding a > > > call to > > > intel_cx0_bus_reset() as part of intel_cx0_phy_transaction_begin()? > > That I haven't try to reset bus at the stage. I can give it a go and > > check what happens. To me it seems, that we are sometimes stuck at > > waiting the ack and eventually we time out and bail out. I have no > > clue why this happens from time to time. We already reset the bus after every read/write operation. In addition, a small delay > seem to help and clear the sporadic read/write failures to the bus. However, this is more like a workaround and I'm not really > happy about this sort of hack. > > > > I will give a go for your suggestion and come back once I have the results. > > I ran a test with intel_cx0_bus_reset() when placed in intel_cx0_phy_transaction_begin(). This didn't help either as with kms_flip > IGT test I was able to trigger this read failure > > [drm] *ERROR* PHY G Read 0d80 failed after 3 retries. > > This was with the configuration where the test vehicle had 4K and eDP monitors connected. I think we can ignore this patch. I was able to trigger this error with this patch applied as well. This doesn't fix the issue either. Sorry for the noise. -Mika- > > > > > > Thanks! > > -Mika- > > > > > > > > Also, I think it would be good if we understood better were those uncleared bits are coming from... > > > > > > -- > > > Gustavo Sousa > > > > > > > intel_de_write(i915, XELPDP_PORT_M2P_MSGBUS_CTL(port, lane), > > > > XELPDP_PORT_M2P_TRANSACTION_PENDING | > > > > (committed ? XELPDP_PORT_M2P_COMMAND_WRITE_COMMITTED : > > > >-- > > > >2.34.1 > > > >