On Mon, 03 Oct 2022 01:11:21 -0700, Tvrtko Ursulin wrote: > On 03/10/2022 06:34, Dixit, Ashutosh wrote: > > On Tue, 27 Sep 2022 07:17:23 -0700, Tvrtko Ursulin wrote: > > > > Hi Tvrtko, > > > > I am adding some people who may have more background/history into this. > > > >> On 10/09/2022 15:38, Ashutosh Dixit wrote: > >>> From: Tilak Tangudu <tilak.tangudu@xxxxxxxxx> > >>> > >>> Add perf_limit_reasons in debugfs. The upper 16 perf_limit_reasons RW "log" > >>> bits are identical to the lower 16 RO "status" bits except that the "log" > >>> bits remain set until cleared, thereby ensuring the throttling occurrence > >>> is not missed. The clear fop clears the upper 16 "log" bits, the get fop > >>> gets all 32 "log" and "status" bits. > >>> > >>> v2: Expand commit message and clarify "log" and "status" bits in > >>> comment (Rodrigo) > >>> > >>> Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > >>> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > >>> Signed-off-by: Tilak Tangudu <tilak.tangudu@xxxxxxxxx> > >>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > >>> --- > >>> drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 31 +++++++++++++++++++ > >>> drivers/gpu/drm/i915/i915_reg.h | 1 + > >>> 2 files changed, 32 insertions(+) > >>> > >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c > >>> index 108b9e76c32e..a009cf69103a 100644 > >>> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c > >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c > >>> @@ -655,6 +655,36 @@ static bool rps_eval(void *data) > >>> DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost); > >>> +static int perf_limit_reasons_get(void *data, u64 *val) > >>> +{ > >>> + struct intel_gt *gt = data; > >>> + intel_wakeref_t wakeref; > >>> + > >>> + with_intel_runtime_pm(gt->uncore->rpm, wakeref) > >>> + *val = intel_uncore_read(gt->uncore, GT0_PERF_LIMIT_REASONS); > >> > >> I have a bunch of questions, given failure to apply this to > >> drm-intel-next-fixes brought my attention to it. > >> > >> Why is GT0_PERF_LIMIT_REASONS_MASK not applied here? > > > > To expose all 32 "log" and "status" bits, since no log bits and only a few > > status bits are available in sysfs. > > > >> Who resets the low bits and when in normal operation? (I mean not via this > >> debugfs clear.) > > > > HW would reset the low status bits after the condition causing the > > throttling goes away. That condition would persist in the the upper log > > bits till cleared via debugfs. > > > >> Why do we have sysfs expose some of this register, but not these high log > >> bits? Which are more important to the user? > > > > The low status bits should be sampled while the throttling condition is > > "current" (still happening). The upper log bits can be looked at later to > > see if a particular condition happened. > > > > I would say as long user land can sample (say every 5 ms) the low status > > bits while a workload is running they can get a complete picture of what > > throttling happened when (the app could sample the actual_freq sysfs and > > correlate those values with the perf_limit_reasons sysfs e.g.). > > > >> What is the use case for allowing clearing of the log bits from debugfs? > >> Why do we want to carry that code? For IGT? > > > > I guess the reason for exposing/clearing the log bits is that they can be > > useful in cases where userspace does not want to sample the status bits > > continuously while a workload is running. They can just come afterwards and > > say, ah, these are the reasons limiting freq while the workload was > > running. > > But as you say hardware clears them, then it is questionable it would show > anything at that point? The status bits would be cleared by HW but any events which have occured would be coalesced and persist in the log bits. > Do we have interrupts associated with throttling and associated log message > or uevents? I believe there are punit interrupts and some work-items to hook these say into HWMON "alarms" but pretty sure none of this is implemented because no clear customer requirements at this point. > > Also the log bits could be used in conjunction with the low status bits to > > infer if any limiting event got missed during sampling the status bits. > > > > I believe IGT's correlating the actual_freq with perf_limit_reasons are > > planned. But yes, whether the upper log bits should also be exposed in > > sysfs is a valid one and should be discussed. > > Thanks for the clarifications! > > Yes, it sounds like some discussion is needed as to intended use cases for > exposing and clearing the log. > > TBH I am not sure how urgent it is - will there be further conflicts caused > by divergence of 7738be973fc4e2ba22154fafd3a5d7b9666f9abf vs > 0d2d201095e9f141d6a9fb44320afce761f8b5c2. If there would be it would > probably be good to close on this question within the current development > cycle. I believe this is already resolved by the merge of drm-intel-fixes into drm-tip. Thanks. -- Ashutosh > >>> + > >>> + return 0; > >>> +} > >>> + > >>> +static int perf_limit_reasons_clear(void *data, u64 val) > >>> +{ > >>> + struct intel_gt *gt = data; > >>> + intel_wakeref_t wakeref; > >>> + > >>> + /* > >>> + * Clear the upper 16 "log" bits, the lower 16 "status" bits are > >>> + * read-only. The upper 16 "log" bits are identical to the lower 16 > >>> + * "status" bits except that the "log" bits remain set until cleared. > >>> + */ > >>> + with_intel_runtime_pm(gt->uncore->rpm, wakeref) > >>> + intel_uncore_rmw(gt->uncore, GT0_PERF_LIMIT_REASONS, > >>> + GT0_PERF_LIMIT_REASONS_LOG_MASK, 0); > >>> + > >>> + return 0; > >>> +} > >>> +DEFINE_SIMPLE_ATTRIBUTE(perf_limit_reasons_fops, perf_limit_reasons_get, > >>> + perf_limit_reasons_clear, "%llu\n"); > >>> + > >>> void intel_gt_pm_debugfs_register(struct intel_gt *gt, struct dentry *root) > >>> { > >>> static const struct intel_gt_debugfs_file files[] = { > >>> @@ -664,6 +694,7 @@ void intel_gt_pm_debugfs_register(struct intel_gt *gt, struct dentry *root) > >>> { "forcewake_user", &forcewake_user_fops, NULL}, > >>> { "llc", &llc_fops, llc_eval }, > >>> { "rps_boost", &rps_boost_fops, rps_eval }, > >>> + { "perf_limit_reasons", &perf_limit_reasons_fops, NULL }, > >>> }; > >>> intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), > >>> gt); > >>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > >>> index 52462cbfdc66..58b0ed9dddd5 100644 > >>> --- a/drivers/gpu/drm/i915/i915_reg.h > >>> +++ b/drivers/gpu/drm/i915/i915_reg.h > >>> @@ -1802,6 +1802,7 @@ > >>> #define POWER_LIMIT_4_MASK REG_BIT(8) > >>> #define POWER_LIMIT_1_MASK REG_BIT(10) > >>> #define POWER_LIMIT_2_MASK REG_BIT(11) > >>> +#define GT0_PERF_LIMIT_REASONS_LOG_MASK REG_GENMASK(31, 16) > >>> #define CHV_CLK_CTL1 _MMIO(0x101100) > >>> #define VLV_CLK_CTL2 _MMIO(0x101104)