Re: [PATCH 2/6] drm/amd/pm: Add arcturus throttler translation

On Fri, May 21, 2021 at 5:32 PM Sider, Graham <Graham.Sider@xxxxxxx> wrote:
>
> Would this be referring to tools that may parse /sys/class/.../device/gpu_metrics or the actual gpu_metrics_vX_Y structs? For the latter, if there are tools whose parsing depends on the vX_Y version, I agree that we would not want to break those.
>
> Since most ASICs currently use different versions, we would have to create a duplicate struct for each gpu_metrics version in use, unless I'm misunderstanding. I'm not sure if this is what you had in mind - let me know.
>

Just update them all to the latest version.  The newer ones are just
supersets of the previous versions.  I think a newer revision went in
within the last day or two for some additional data; you can probably
just piggyback on that, since the code is not upstream yet.
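
To make the version point concrete, here is a minimal userspace sketch
of how a tool might check the gpu_metrics header before interpreting
throttle_status. It assumes the blob read from sysfs starts with a
header of structure_size, format_revision and content_revision, as the
kernel's metrics table header does. The names read_metrics_header and
classify_throttle_layout, and the first_indep_rev cutoff, are
hypothetical: this thread has not settled which revision would carry
the new bit layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the header at the start of every gpu_metrics blob. */
struct metrics_header {
	uint16_t structure_size;
	uint8_t format_revision;
	uint8_t content_revision;
};

enum throttle_layout {
	THROTTLE_LAYOUT_UNKNOWN,
	THROTTLE_LAYOUT_ASIC_DEPENDENT,
	THROTTLE_LAYOUT_INDEPENDENT,
};

/* Copy the header out of a raw buffer; returns 0 on success, -1 if the
 * buffer is too short or the advertised size is implausible. */
static int read_metrics_header(const uint8_t *buf, size_t len,
			       struct metrics_header *hdr)
{
	if (buf == NULL || hdr == NULL || len < sizeof(*hdr))
		return -1;
	memcpy(hdr, buf, sizeof(*hdr));
	if (hdr->structure_size < sizeof(*hdr) || hdr->structure_size > len)
		return -1;
	return 0;
}

/* Decide how throttle_status should be decoded. format_rev is the table
 * family the caller understands; first_indep_rev is a HYPOTHETICAL
 * content revision from which the independent bit layout applies. */
static enum throttle_layout
classify_throttle_layout(const struct metrics_header *hdr,
			 uint8_t format_rev, uint8_t first_indep_rev)
{
	assert(hdr != NULL);
	if (hdr->format_revision != format_rev)
		return THROTTLE_LAYOUT_UNKNOWN;
	return hdr->content_revision >= first_indep_rev ?
		THROTTLE_LAYOUT_INDEPENDENT : THROTTLE_LAYOUT_ASIC_DEPENDENT;
}
```

Because newer revisions are supersets of older ones, a tool written
this way only needs the cutoff revision to pick the right decoding; the
fields it already reads stay at the same offsets.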

Alex


> Best,
> Graham
>
> -----Original Message-----
> From: Alex Deucher <alexdeucher@xxxxxxxxx>
> Sent: Friday, May 21, 2021 4:15 PM
> To: Sider, Graham <Graham.Sider@xxxxxxx>
> Cc: amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>; Sakhnovitch, Elena (Elen) <Elena.Sakhnovitch@xxxxxxx>
> Subject: Re: [PATCH 2/6] drm/amd/pm: Add arcturus throttler translation
>
> On Fri, May 21, 2021 at 1:39 PM Sider, Graham <Graham.Sider@xxxxxxx> wrote:
> >
> > Hi Alex,
> >
> > Are you referring to bumping the gpu_metrics_vX_Y version number? Different ASICs are already using different version numbers, so I'm not sure how feasible this would be (e.g. arcturus == gpu_metrics_v1_1, navi1x == gpu_metrics_v1_3, vangogh == gpu_metrics_v2_1).
> >
> > Technically speaking, no new fields have been added to any of the gpu_metrics versions; only the representation of the throttle_status field has changed. Let me know your thoughts on this.
> >
>
> I don't know if we have any existing tools out there that parse this data, but if so, they would interpret it incorrectly after this change.  If we bump the version, at least the tools will know how to handle it.
>
> Alex
>
>
> > Best,
> > Graham
> >
> > -----Original Message-----
> > From: Alex Deucher <alexdeucher@xxxxxxxxx>
> > Sent: Friday, May 21, 2021 10:27 AM
> > To: Sider, Graham <Graham.Sider@xxxxxxx>
> > Cc: amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Kasiviswanathan,
> > Harish <Harish.Kasiviswanathan@xxxxxxx>; Sakhnovitch, Elena (Elen)
> > <Elena.Sakhnovitch@xxxxxxx>
> > Subject: Re: [PATCH 2/6] drm/amd/pm: Add arcturus throttler
> > translation
> >
> > General comment on the patch series: do you want to bump the metrics table version, since the meaning of the throttler status has changed?
> >
> > Alex
> >
> > On Thu, May 20, 2021 at 10:30 AM Graham Sider <Graham.Sider@xxxxxxx> wrote:
> > >
> > > > Perform dependent-to-independent throttle status translation for
> > > > arcturus.
> > > ---
> > > >  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 62 ++++++++++++++++---
> > > >  1 file changed, 53 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> > > index 1735a96dd307..7c01c0bf2073 100644
> > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
> > > @@ -540,6 +540,49 @@ static int arcturus_freqs_in_same_level(int32_t frequency1,
> > > >         return (abs(frequency1 - frequency2) <= EPSILON);
> > > >  }
> > > >
> > > > +static uint32_t arcturus_get_indep_throttler_status(
> > > > +                                       unsigned long dep_throttler_status)
> > > > +{
> > > +       unsigned long indep_throttler_status = 0;
> > > +
> > > +       __assign_bit(INDEP_THROTTLER_TEMP_EDGE_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TEMP_EDGE_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TEMP_HOTSPOT_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TEMP_HOTSPOT_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TEMP_MEM_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TEMP_MEM_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TEMP_VR_GFX_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TEMP_VR_GFX_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TEMP_VR_MEM_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TEMP_VR_MEM_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TEMP_VR_SOC_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TEMP_VR_SOC_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TDC_GFX_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TDC_GFX_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_TDC_SOC_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_TDC_SOC_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_PPT0_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_PPT0_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_PPT1_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_PPT1_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_PPT2_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_PPT2_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_PPT3_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_PPT3_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_PPM_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_PPM_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_FIT_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_FIT_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_APCC_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_APCC_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_VRHOT0_BIT, &indep_throttler_status,
> > > +                 test_bit(THROTTLER_VRHOT0_BIT, &dep_throttler_status));
> > > +       __assign_bit(INDEP_THROTTLER_VRHOT1_BIT, &indep_throttler_status,
> > > > +                 test_bit(THROTTLER_VRHOT1_BIT, &dep_throttler_status));
> > > > +
> > > > +       return (uint32_t)indep_throttler_status;
> > > > +}
> > > +
> > >  static int arcturus_get_smu_metrics_data(struct smu_context *smu,
> > >                                          MetricsMember_t member,
> > > >                                          uint32_t *value)
> > > > @@ -629,7 +672,7 @@ static int arcturus_get_smu_metrics_data(struct smu_context *smu,
> > >                         SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
> > >                 break;
> > >         case METRICS_THROTTLER_STATUS:
> > > -               *value = metrics->ThrottlerStatus;
> > > > +               *value = arcturus_get_indep_throttler_status(metrics->ThrottlerStatus);
> > >                 break;
> > >         case METRICS_CURR_FANSPEED:
> > > >                 *value = metrics->CurrFanSpeed;
> > > > @@ -2213,13 +2256,13 @@ static const struct throttling_logging_label {
> > >         uint32_t feature_mask;
> > >         const char *label;
> > >  } logging_label[] = {
> > > -       {(1U << THROTTLER_TEMP_HOTSPOT_BIT), "GPU"},
> > > -       {(1U << THROTTLER_TEMP_MEM_BIT), "HBM"},
> > > -       {(1U << THROTTLER_TEMP_VR_GFX_BIT), "VR of GFX rail"},
> > > -       {(1U << THROTTLER_TEMP_VR_MEM_BIT), "VR of HBM rail"},
> > > -       {(1U << THROTTLER_TEMP_VR_SOC_BIT), "VR of SOC rail"},
> > > -       {(1U << THROTTLER_VRHOT0_BIT), "VR0 HOT"},
> > > -       {(1U << THROTTLER_VRHOT1_BIT), "VR1 HOT"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_HOTSPOT_BIT), "GPU"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_MEM_BIT), "HBM"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_VR_GFX_BIT), "VR of GFX rail"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_VR_MEM_BIT), "VR of HBM rail"},
> > > +       {(1U << INDEP_THROTTLER_TEMP_VR_SOC_BIT), "VR of SOC rail"},
> > > +       {(1U << INDEP_THROTTLER_VRHOT0_BIT), "VR0 HOT"},
> > > +       {(1U << INDEP_THROTTLER_VRHOT1_BIT), "VR1 HOT"},
> > >  };
> > > >  static void arcturus_log_thermal_throttling_event(struct smu_context *smu)
> > > >  {
> > > > @@ -2314,7 +2357,8 @@ static ssize_t arcturus_get_gpu_metrics(struct smu_context *smu,
> > >         gpu_metrics->current_vclk0 = metrics.CurrClock[PPCLK_VCLK];
> > >         gpu_metrics->current_dclk0 = metrics.CurrClock[PPCLK_DCLK];
> > >
> > > -       gpu_metrics->throttle_status = metrics.ThrottlerStatus;
> > > > +       gpu_metrics->throttle_status =
> > > > +                       arcturus_get_indep_throttler_status(metrics.ThrottlerStatus);
> > >
> > >         gpu_metrics->current_fan_speed = metrics.CurrFanSpeed;
> > >
> > > --
> > > 2.17.1
> > >
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


