> -----Original Message----- > From: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx> > Sent: Wednesday, August 30, 2023 3:19 PM > To: Kahola, Mika <mika.kahola@xxxxxxxxx>; Sripada, Radhakrishna <radhakrishna.sripada@xxxxxxxxx>; intel- > gfx@xxxxxxxxxxxxxxxxxxxxx > Subject: RE: [PATCH] drm/i915/cx0: Check and increase msgbus timeout threshold > > Quoting Gustavo Sousa (2023-08-29 08:45:41-03:00) > >Quoting Kahola, Mika (2023-08-29 06:35:17-03:00) > >>> -----Original Message----- > >>> From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf > >>> Of Sripada, Radhakrishna > >>> Sent: Tuesday, August 29, 2023 1:54 AM > >>> To: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx>; > >>> intel-gfx@xxxxxxxxxxxxxxxxxxxxx > >>> Subject: Re: [PATCH] drm/i915/cx0: Check and increase > >>> msgbus timeout threshold > >>> > >>> Hi Gustavo, > >>> > >>> > -----Original Message----- > >>> > From: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx> > >>> > Sent: Monday, August 28, 2023 2:46 PM > >>> > To: Sripada, Radhakrishna <radhakrishna.sripada@xxxxxxxxx>; intel- > >>> > gfx@xxxxxxxxxxxxxxxxxxxxx > >>> > Subject: RE: [PATCH] drm/i915/cx0: Check and increase > >>> > msgbus timeout threshold > >>> > > >>> > Quoting Sripada, Radhakrishna (2023-08-28 17:30:19-03:00) > >>> > >Hi Gustavo, > >>> > > >>> > Hi, RK. > >>> > > >>> > Thanks for the feedback! Please, see my replies below. > >>> > > >>> > > > >>> > >> -----Original Message----- > >>> > >> From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On > >>> > >> Behalf Of > >>> > Gustavo > >>> > >> Sousa > >>> > >> Sent: Monday, August 28, 2023 10:34 AM > >>> > >> To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx > >>> > >> Subject: [PATCH] drm/i915/cx0: Check and increase > >>> > >> msgbus > >>> > timeout > >>> > >> threshold > >>> > >> > >>> > >> We have experienced timeout issues when accessing C20 SRAM registers. > >>> > >This wording is misleading. It seems to imply that we directly > >>> > >access SRAM registers via msg bus but the reads/writes go through > >>> > >intermediate registers and it is this read/write that is failing. Adding extra clarity here would be helpful. > >>> > > >>> > Hm... Okay. I thought that would already be implied to someone > >>> > familiar with the code. What about: > >>> > > >>> > s/when accessing/when going through the sequence to access/ > >>> This is better to indicate that it is not direct access. > >>> > >>> > > >>> > ? > >>> > > >>> > > > >>> > >> Experimentation showed that bumping the message bus timer > >>> > >> threshold helped on getting display Type-C connection on the C20 PHY to work. > >>> > >> > >>> > >> While the timeout is still under investigation with the HW > >>> > >> team, having logic to allow forward progress (with the proper warnings) seems useful. > >>> > >> Thus, let's bump the threshold when a timeout is detected. > >>> > >> > >>> > >> The maximum value of 0x200 pclk cycles was somewhat arbitrary - > >>> > >> 2x the default value. That value was successfully tested on > >>> > >> real hardware that was displaying timeouts otherwise. > >>> > >> > >>> > >> BSpec: 65156 > >>> > >> Signed-off-by: Gustavo Sousa <gustavo.sousa@xxxxxxxxx> > >>> > >> --- > >>> > >> drivers/gpu/drm/i915/display/intel_cx0_phy.c | 41 > >>> > >> +++++++++++++++++++ > >>> > >> +++++++++++++++++++ .../gpu/drm/i915/display/intel_cx0_phy_regs > >>> > >> +++++++++++++++++++ .h > >>> > >> | 12 ++++++ > >>> > >> 2 files changed, 53 insertions(+) > >>> > >> > >>> > >> diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy.c > >>> > >> b/drivers/gpu/drm/i915/display/intel_cx0_phy.c > >>> > >> index dd489b50ad60..55d2333032e3 100644 > >>> > >> --- a/drivers/gpu/drm/i915/display/intel_cx0_phy.c > >>> > >> +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy.c > >>> > >> @@ -5,6 +5,7 @@ > >>> > >> > >>> > >> #include <linux/log2.h> > >>> > >> #include <linux/math64.h> > >>> > >> +#include <linux/minmax.h> > >>> > >> #include "i915_reg.h" > >>> > >> #include "intel_cx0_phy.h" > >>> > >> #include "intel_cx0_phy_regs.h" > >>> > >> @@ -29,6 +30,8 @@ > >>> > >> #define INTEL_CX0_LANE1 BIT(1) > >>> > >> #define INTEL_CX0_BOTH_LANES (INTEL_CX0_LANE1 | > >>> > >> INTEL_CX0_LANE0) > >>> > >> > >>> > >> +#define INTEL_CX0_MSGBUS_TIMER_VAL_MAX 0x200 > >>> > >> + > >>> > >> bool intel_is_c10phy(struct drm_i915_private *i915, enum phy > >>> > >> phy) { > >>> > >> if (DISPLAY_VER_FULL(i915) == IP_VER(14, 0) && phy < > >>> > >> PHY_C) @@ -119,6 +122,43 @@ static void > >>> > >> intel_cx0_bus_reset(struct > >>> > drm_i915_private > >>> > >> *i915, enum port port, i > >>> > >> intel_clear_response_ready_flag(i915, port, lane); } > >>> > >> > >>> > >> +/* > >>> > >> + * Check if there was a timeout detected by the hardware in > >>> > >> +the message bus > >>> > >> + * and bump the threshold if so. > >> > >>Just a thought but wouldn't it be simpler if we just set the timeout to it's maximum and discard the check here? > > > >I'm not sure which exactly check you are talking about here: > > > > 1) The check on the XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT. > > > > I think this could give us useful debugging info, for example, if our code > > thinks the message bus timed out while the bus hardware did not say so. I > > would prefer to keep this one, if you are okay with it. > > > > 2) The check on the current value of the threshold and the exponential > > increase up to the maximum. > > > > I would agree that I did a bit of over-engineering here. I guess I could > > simply to a rmw using INTEL_CX0_MSGBUS_TIMER_VAL_MAX directly as you > > suggested; and maybe rename INTEL_CX0_MSGBUS_TIMER_VAL_MAX to > > INTEL_CX0_MSGBUS_TIMER_BUMPED_VAL, just not to give the impression that > > 0x200 is the maximum accepted by the hardware. > > > >What do you think? > > I've sent a v2 addressing (2). Yes, you're right, we would benefit from this timeout check. I think your approach to directly bump up timeout to it's max is a way to go. I will have a look at the v2 patch. -Mika- > > -- > Gustavo Sousa > > > > >-- > >Gustavo Sousa > > > >> > >>-Mika- > >> > >>> > >> + */ > >>> > >> +static void intel_cx0_bus_check_and_bump_timer(struct > >>> > >> +drm_i915_private > >>> > >> *i915, > >>> > >> + enum port port, > >>> > >> +int > >>> > >> +lane) { > >>> > >> + enum phy phy = intel_port_to_phy(i915, port); > >>> > >> + i915_reg_t reg; > >>> > >> + u32 val; > >>> > >> + u32 timer_val; > >>> > >> + > >>> > >> + reg = XELPDP_PORT_MSGBUS_TIMER(port, lane); > >>> > >> + val = intel_de_read(i915, reg); > >>> > >> + > >>> > >> + if (!(val & XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT)) { > >>> > >> + drm_warn(&i915->drm, > >>> > >> + "PHY %c lane %d: hardware did not > >>> > >> + detect a > >>> > >> timeout\n", > >>> > >> + phy_name(phy), lane); > >>> > >> + return; > >>> > >> + } > >>> > >> + > >>> > >> + timer_val = > >>> > >> REG_FIELD_GET(XELPDP_PORT_MSGBUS_TIMER_VAL_MASK, val); > >>> > >> + > >>> > >> + if (timer_val == INTEL_CX0_MSGBUS_TIMER_VAL_MAX) > >>> > >> + return; > >>> > >> + > >>> > >> + timer_val = min(2 * timer_val, > >>> > >> (u32)INTEL_CX0_MSGBUS_TIMER_VAL_MAX); > >>> > > ^ is this cast necessary? > >>> > > >>> > Yes. I remember getting a warning without it. Let me remove it to remember... > >>> Got it thanks for the quick check. > >>> > >>> > > >>> > ...done! I think it complains because the arguments are of different types: > >>> > > >>> > In file included from drivers/gpu/drm/i915/display/intel_cx0_phy.c:8: > >>> > drivers/gpu/drm/i915/display/intel_cx0_phy.c: In function > >>> > ‘intel_cx0_bus_check_and_bump_timer’: > >>> > ./include/linux/minmax.h:20:35: error: comparison of distinct > >>> > pointer types lacks a cast [-Werror] > >>> > 20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) > >>> > | ^~ > >>> > ./include/linux/minmax.h:26:18: note: in expansion of macro ‘__typecheck’ > >>> > 26 | (__typecheck(x, y) && __no_side_effects(x, y)) > >>> > | ^~~~~~~~~~~ > >>> > ./include/linux/minmax.h:36:31: note: in expansion of macro ‘__safe_cmp’ > >>> > 36 | __builtin_choose_expr(__safe_cmp(x, y), \ > >>> > | ^~~~~~~~~~ > >>> > ./include/linux/minmax.h:67:25: note: in expansion of macro ‘__careful_cmp’ > >>> > 67 | #define min(x, y) __careful_cmp(x, y, <) > >>> > | ^~~~~~~~~~~~~ > >>> > drivers/gpu/drm/i915/display/intel_cx0_phy.c:152:21: note: in > >>> > expansion of macro ‘min’ > >>> > 152 | timer_val = min(2 * timer_val, > >>> > INTEL_CX0_MSGBUS_TIMER_VAL_MAX); > >>> > | > >>> > > >>> > > > >>> > >> + val &= ~XELPDP_PORT_MSGBUS_TIMER_VAL_MASK; > >>> > >> + val |= > >>> > >> + REG_FIELD_PREP(XELPDP_PORT_MSGBUS_TIMER_VAL_MASK, > >>> > >> timer_val); > >>> > >We do not use REG_FIELD_PREP in the function. Instead we use it > >>> > >in register definition in intel_cx0_phy_regs.h. > >>> > > >>> > I think it makes sense have definitions using REG_FIELD_PREP() in > >>> > header files and use those definitions in .c files for fields > >>> > whose values are are enumerated. > >>> > > >>> > Now, for fields without enumeration, like this one in discussion, > >>> > I think it is fine to use REG_FIELD_PREP() directly. The MASK is already exposed anyway... > >>> > > >>> > Besides, it seems we already have lines of code in *.c files using > >>> > the pattern > >>> > above: > >>> > > >>> > git grep REG_FIELD_PREP drm-tip/drm-tip -- ':/drivers/gpu/drm/i915/**/*.c' > >>> > > >>> > Let me know what you think. I could add a > >>> > XELPDP_PORT_MSGBUS_TIMER_VAL() macro if you think it is necessary. > >>> > My personal take is that using REG_FIELD_PREP() for this case is fine. > >>> Let us conform with the norms for the macro-fields used in this file instead of starting a new pattern. > >>> > >>> --Radhakrishna(RK) Sripada > >>> > > >>> > -- > >>> > Gustavo Sousa > >>> > > >>> > > > >>> > >Thanks, > >>> > >Radhakrishna Sripada > >>> > > > >>> > >> + > >>> > >> + drm_dbg_kms(&i915->drm, > >>> > >> + "PHY %c lane %d: increasing msgbus timer > >>> > >> + threshold to > >>> > >> %#x\n", > >>> > >> + phy_name(phy), lane, timer_val); > >>> > >> + intel_de_write(i915, reg, val); } > >>> > >> + > >>> > >> static int intel_cx0_wait_for_ack(struct drm_i915_private > >>> > >> *i915, enum port > >>> > port, > >>> > >> int command, int lane, u32 > >>> > >> *val) { @@ -132,6 +172,7 @@ static int > >>> > >> intel_cx0_wait_for_ack(struct drm_i915_private *i915, enum port > >>> > >> port, > >>> > >> > >>> > >> XELPDP_MSGBUS_TIMEOUT_SLOW, > >>> > >> val)) { > >>> > >> drm_dbg_kms(&i915->drm, "PHY %c Timeout > >>> > >> waiting for message ACK. Status: 0x%x\n", > >>> > >> phy_name(phy), *val); > >>> > >> + intel_cx0_bus_check_and_bump_timer(i915, port, > >>> > >> + lane); > >>> > >> intel_cx0_bus_reset(i915, port, lane); > >>> > >> return -ETIMEDOUT; > >>> > >> } > >>> > >> diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h > >>> > >> b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h > >>> > >> index cb5d1be2ba19..17cc3385aabb 100644 > >>> > >> --- a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h > >>> > >> +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h > >>> > >> @@ -110,6 +110,18 @@ > >>> > >> #define CX0_P4PG_STATE_DISABLE 0xC > >>> > >> #define CX0_P2_STATE_RESET 0x2 > >>> > >> > >>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_A > >>> > >> 0x640d8 > >>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_B > >>> > >> 0x641d8 > >>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC1 0x16f258 > >>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC2 0x16f458 > >>> > >> +#define XELPDP_PORT_MSGBUS_TIMER(port, lane) > >>> > >> _MMIO(_PICK_EVEN_2RANGES(port, PORT_TC1, \ > >>> > >> + > >>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_A, \ > >>> > >> + > >>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_B, \ > >>> > >> + > >>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC1, \ > >>> > >> + > >>> > >> _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC2) + (lane) * 4) > >>> > >> +#define XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT REG_BIT(31) > >>> > >> +#define XELPDP_PORT_MSGBUS_TIMER_VAL_MASK > >>> > >> REG_GENMASK(23, 0) > >>> > >> + > >>> > >> #define _XELPDP_PORT_CLOCK_CTL_A 0x640E0 > >>> > >> #define _XELPDP_PORT_CLOCK_CTL_B 0x641E0 > >>> > >> #define _XELPDP_PORT_CLOCK_CTL_USBC1 0x16F260 > >>> > >> -- > >>> > >> 2.41.0 > >>> > >