Re: [PATCH] drm/i915/cx0: Check and increase msgbus timeout threshold

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Gustavo Sousa (2023-08-29 08:45:41-03:00)
>Quoting Kahola, Mika (2023-08-29 06:35:17-03:00)
>>> -----Original Message-----
>>> From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Sripada, Radhakrishna
>>> Sent: Tuesday, August 29, 2023 1:54 AM
>>> To: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
>>> Subject: Re:  [PATCH] drm/i915/cx0: Check and increase msgbus timeout threshold
>>> 
>>> Hi Gustavo,
>>> 
>>> > -----Original Message-----
>>> > From: Sousa, Gustavo <gustavo.sousa@xxxxxxxxx>
>>> > Sent: Monday, August 28, 2023 2:46 PM
>>> > To: Sripada, Radhakrishna <radhakrishna.sripada@xxxxxxxxx>; intel-
>>> > gfx@xxxxxxxxxxxxxxxxxxxxx
>>> > Subject: RE:  [PATCH] drm/i915/cx0: Check and increase
>>> > msgbus timeout threshold
>>> >
>>> > Quoting Sripada, Radhakrishna (2023-08-28 17:30:19-03:00)
>>> > >Hi Gustavo,
>>> >
>>> > Hi, RK.
>>> >
>>> > Thanks for the feedback! Please, see my replies below.
>>> >
>>> > >
>>> > >> -----Original Message-----
>>> > >> From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf
>>> > >> Of
>>> > Gustavo
>>> > >> Sousa
>>> > >> Sent: Monday, August 28, 2023 10:34 AM
>>> > >> To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
>>> > >> Subject:  [PATCH] drm/i915/cx0: Check and increase
>>> > >> msgbus
>>> > timeout
>>> > >> threshold
>>> > >>
>>> > >> We have experienced timeout issues when accessing C20 SRAM registers.
>>> > >This wording is misleading. It seems to imply that we directly access
>>> > >SRAM registers via msg bus but the reads/writes go through
>>> > >intermediate registers and it is this read/write that is failing. Adding extra clarity here would be helpful.
>>> >
>>> > Hm... Okay. I thought that would already be implied to someone
>>> > familiar with the code. What about:
>>> >
>>> >     s/when accessing/when going through the sequence to access/
>>> This is better to indicate that it is not direct access.
>>> 
>>> >
>>> > ?
>>> >
>>> > >
>>> > >> Experimentation showed that bumping the message bus timer threshold
>>> > >> helped on getting display Type-C connection on the C20 PHY to work.
>>> > >>
>>> > >> While the timeout is still under investigation with the HW team,
>>> > >> having logic to allow forward progress (with the proper warnings) seems useful.
>>> > >> Thus, let's bump the threshold when a timeout is detected.
>>> > >>
>>> > >> The maximum value of 0x200 pclk cycles was somewhat arbitrary - 2x
>>> > >> the default value. That value was successfully tested on real
>>> > >> hardware that was displaying timeouts otherwise.
>>> > >>
>>> > >> BSpec: 65156
>>> > >> Signed-off-by: Gustavo Sousa <gustavo.sousa@xxxxxxxxx>
>>> > >> ---
>>> > >>  drivers/gpu/drm/i915/display/intel_cx0_phy.c  | 41
>>> > >> +++++++++++++++++++  .../gpu/drm/i915/display/intel_cx0_phy_regs.h
>>> > >> | 12 ++++++
>>> > >>  2 files changed, 53 insertions(+)
>>> > >>
>>> > >> diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>>> > >> b/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>>> > >> index dd489b50ad60..55d2333032e3 100644
>>> > >> --- a/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>>> > >> +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy.c
>>> > >> @@ -5,6 +5,7 @@
>>> > >>
>>> > >>  #include <linux/log2.h>
>>> > >>  #include <linux/math64.h>
>>> > >> +#include <linux/minmax.h>
>>> > >>  #include "i915_reg.h"
>>> > >>  #include "intel_cx0_phy.h"
>>> > >>  #include "intel_cx0_phy_regs.h"
>>> > >> @@ -29,6 +30,8 @@
>>> > >>  #define INTEL_CX0_LANE1                BIT(1)
>>> > >>  #define INTEL_CX0_BOTH_LANES        (INTEL_CX0_LANE1 |
>>> > >> INTEL_CX0_LANE0)
>>> > >>
>>> > >> +#define INTEL_CX0_MSGBUS_TIMER_VAL_MAX        0x200
>>> > >> +
>>> > >>  bool intel_is_c10phy(struct drm_i915_private *i915, enum phy phy)
>>> > >> {
>>> > >>          if (DISPLAY_VER_FULL(i915) == IP_VER(14, 0) && phy <
>>> > >> PHY_C) @@ -119,6 +122,43 @@ static void intel_cx0_bus_reset(struct
>>> > drm_i915_private
>>> > >> *i915, enum port port, i
>>> > >>          intel_clear_response_ready_flag(i915, port, lane);  }
>>> > >>
>>> > >> +/*
>>> > >> + * Check if there was a timeout detected by the hardware in the
>>> > >> +message bus
>>> > >> + * and bump the threshold if so.
>>
>>Just a thought but wouldn't it be simpler if we just set the timeout to it's maximum and discard the check here?
>
>I'm not sure which exactly check you are talking about here:
>
>  1) The check on the XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT.
>
>     I think this could give us useful debugging info, for example, if our code
>     thinks the message bus timed out while the bus hardware did not say so. I
>     would prefer to keep this one, if you are okay with it.
>
>  2) The check on the current value of the threshold and the exponential
>     increase up to the maximum.
>
>     I would agree that I did a bit of over-engineering here. I guess I could
>     simply to a rmw using INTEL_CX0_MSGBUS_TIMER_VAL_MAX directly as you
>     suggested; and maybe rename INTEL_CX0_MSGBUS_TIMER_VAL_MAX to
>     INTEL_CX0_MSGBUS_TIMER_BUMPED_VAL, just not to give the impression that
>     0x200 is the maximum accepted by the hardware.
>
>What do you think?

I've sent a v2 addressing (2).

--
Gustavo Sousa

>
>--
>Gustavo Sousa
>
>>
>>-Mika- 
>>
>>> > >> + */
>>> > >> +static void intel_cx0_bus_check_and_bump_timer(struct
>>> > >> +drm_i915_private
>>> > >> *i915,
>>> > >> +                                               enum port port, int
>>> > >> +lane) {
>>> > >> +        enum phy phy = intel_port_to_phy(i915, port);
>>> > >> +        i915_reg_t reg;
>>> > >> +        u32 val;
>>> > >> +        u32 timer_val;
>>> > >> +
>>> > >> +        reg = XELPDP_PORT_MSGBUS_TIMER(port, lane);
>>> > >> +        val = intel_de_read(i915, reg);
>>> > >> +
>>> > >> +        if (!(val & XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT)) {
>>> > >> +                drm_warn(&i915->drm,
>>> > >> +                         "PHY %c lane %d: hardware did not detect
>>> > >> + a
>>> > >> timeout\n",
>>> > >> +                         phy_name(phy), lane);
>>> > >> +                return;
>>> > >> +        }
>>> > >> +
>>> > >> +        timer_val =
>>> > >> REG_FIELD_GET(XELPDP_PORT_MSGBUS_TIMER_VAL_MASK, val);
>>> > >> +
>>> > >> +        if (timer_val == INTEL_CX0_MSGBUS_TIMER_VAL_MAX)
>>> > >> +                return;
>>> > >> +
>>> > >> +        timer_val = min(2 * timer_val,
>>> > >> (u32)INTEL_CX0_MSGBUS_TIMER_VAL_MAX);
>>> > >     ^ is this cast necessary?
>>> >
>>> > Yes. I remember getting a warning without it. Let me remove it to remember...
>>> Got it thanks for the quick check.
>>> 
>>> >
>>> > ...done! I think it complains because the arguments are of different types:
>>> >
>>> >     In file included from drivers/gpu/drm/i915/display/intel_cx0_phy.c:8:
>>> >     drivers/gpu/drm/i915/display/intel_cx0_phy.c: In function
>>> > ‘intel_cx0_bus_check_and_bump_timer’:
>>> >     ./include/linux/minmax.h:20:35: error: comparison of distinct
>>> > pointer types lacks a cast [-Werror]
>>> >        20 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
>>> >           |                                   ^~
>>> >     ./include/linux/minmax.h:26:18: note: in expansion of macro ‘__typecheck’
>>> >        26 |                 (__typecheck(x, y) && __no_side_effects(x, y))
>>> >           |                  ^~~~~~~~~~~
>>> >     ./include/linux/minmax.h:36:31: note: in expansion of macro ‘__safe_cmp’
>>> >        36 |         __builtin_choose_expr(__safe_cmp(x, y), \
>>> >           |                               ^~~~~~~~~~
>>> >     ./include/linux/minmax.h:67:25: note: in expansion of macro ‘__careful_cmp’
>>> >        67 | #define min(x, y)       __careful_cmp(x, y, <)
>>> >           |                         ^~~~~~~~~~~~~
>>> >     drivers/gpu/drm/i915/display/intel_cx0_phy.c:152:21: note: in
>>> > expansion of macro ‘min’
>>> >       152 |         timer_val = min(2 * timer_val,
>>> > INTEL_CX0_MSGBUS_TIMER_VAL_MAX);
>>> >           |
>>> >
>>> > >
>>> > >> +        val &= ~XELPDP_PORT_MSGBUS_TIMER_VAL_MASK;
>>> > >> +        val |= REG_FIELD_PREP(XELPDP_PORT_MSGBUS_TIMER_VAL_MASK,
>>> > >> timer_val);
>>> > >We do not use REG_FIELD_PREP in the function. Instead we use it in
>>> > >register definition in intel_cx0_phy_regs.h.
>>> >
>>> > I think it makes sense have definitions using REG_FIELD_PREP() in
>>> > header files and use those definitions in .c files for fields whose
>>> > values are are enumerated.
>>> >
>>> > Now, for fields without enumeration, like this one in discussion, I
>>> > think it is fine to use REG_FIELD_PREP() directly. The MASK is already exposed anyway...
>>> >
>>> > Besides, it seems we already have lines of code in *.c files using the
>>> > pattern
>>> > above:
>>> >
>>> >     git grep REG_FIELD_PREP drm-tip/drm-tip -- ':/drivers/gpu/drm/i915/**/*.c'
>>> >
>>> > Let me know what you think. I could add a
>>> > XELPDP_PORT_MSGBUS_TIMER_VAL() macro if you think it is necessary. My
>>> > personal take is that using REG_FIELD_PREP() for this case is fine.
>>> Let us conform with the norms for the macro-fields used in this file instead of starting a new pattern.
>>> 
>>> --Radhakrishna(RK) Sripada
>>> >
>>> > --
>>> > Gustavo Sousa
>>> >
>>> > >
>>> > >Thanks,
>>> > >Radhakrishna Sripada
>>> > >
>>> > >> +
>>> > >> +        drm_dbg_kms(&i915->drm,
>>> > >> +                    "PHY %c lane %d: increasing msgbus timer
>>> > >> + threshold to
>>> > >> %#x\n",
>>> > >> +                    phy_name(phy), lane, timer_val);
>>> > >> +        intel_de_write(i915, reg, val); }
>>> > >> +
>>> > >>  static int intel_cx0_wait_for_ack(struct drm_i915_private *i915,
>>> > >> enum port
>>> > port,
>>> > >>                                    int command, int lane, u32 *val)
>>> > >> { @@ -132,6 +172,7 @@ static int intel_cx0_wait_for_ack(struct
>>> > >> drm_i915_private *i915, enum port port,
>>> > >>
>>> > >> XELPDP_MSGBUS_TIMEOUT_SLOW,
>>> > >> val)) {
>>> > >>                  drm_dbg_kms(&i915->drm, "PHY %c Timeout waiting
>>> > >> for message ACK. Status: 0x%x\n",
>>> > >>                              phy_name(phy), *val);
>>> > >> +                intel_cx0_bus_check_and_bump_timer(i915, port,
>>> > >> + lane);
>>> > >>                  intel_cx0_bus_reset(i915, port, lane);
>>> > >>                  return -ETIMEDOUT;
>>> > >>          }
>>> > >> diff --git a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>>> > >> b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>>> > >> index cb5d1be2ba19..17cc3385aabb 100644
>>> > >> --- a/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>>> > >> +++ b/drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>>> > >> @@ -110,6 +110,18 @@
>>> > >>  #define   CX0_P4PG_STATE_DISABLE                        0xC
>>> > >>  #define   CX0_P2_STATE_RESET                                0x2
>>> > >>
>>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_A
>>> > >>         0x640d8
>>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_B
>>> > >>         0x641d8
>>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC1                0x16f258
>>> > >> +#define _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC2                0x16f458
>>> > >> +#define XELPDP_PORT_MSGBUS_TIMER(port, lane)
>>> > >>         _MMIO(_PICK_EVEN_2RANGES(port, PORT_TC1, \
>>> > >> +
>>> > >>          _XELPDP_PORT_MSGBUS_TIMER_LN0_A, \
>>> > >> +
>>> > >>          _XELPDP_PORT_MSGBUS_TIMER_LN0_B, \
>>> > >> +
>>> > >>          _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC1, \
>>> > >> +
>>> > >>          _XELPDP_PORT_MSGBUS_TIMER_LN0_USBC2) + (lane) * 4)
>>> > >> +#define   XELPDP_PORT_MSGBUS_TIMER_TIMED_OUT                REG_BIT(31)
>>> > >> +#define   XELPDP_PORT_MSGBUS_TIMER_VAL_MASK
>>> > >>         REG_GENMASK(23, 0)
>>> > >> +
>>> > >>  #define _XELPDP_PORT_CLOCK_CTL_A                        0x640E0
>>> > >>  #define _XELPDP_PORT_CLOCK_CTL_B                        0x641E0
>>> > >>  #define _XELPDP_PORT_CLOCK_CTL_USBC1                        0x16F260
>>> > >> --
>>> > >> 2.41.0
>>> > >




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux