> -----Original Message----- > From: Gupta, Anshuman > Sent: Wednesday, November 20, 2024 4:01 PM > To: Nilawar, Badal <badal.nilawar@xxxxxxxxx>; Anirban, Sk > <sk.anirban@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Poosa, Karthik <karthik.poosa@xxxxxxxxx>; Pottumuttu, Sai Teja > <sai.teja.pottumuttu@xxxxxxxxx> > Subject: RE: [PATCH v6] drm/i915/selftests: Implement frequency logging for > energy reading validation > > > > > -----Original Message----- > > From: Nilawar, Badal <badal.nilawar@xxxxxxxxx> > > Sent: Wednesday, November 20, 2024 1:44 PM > > To: Anirban, Sk <sk.anirban@xxxxxxxxx>; > > intel-gfx@xxxxxxxxxxxxxxxxxxxxx > > Cc: Gupta, Anshuman <anshuman.gupta@xxxxxxxxx>; Poosa, Karthik > > <karthik.poosa@xxxxxxxxx>; Pottumuttu, Sai Teja > > <sai.teja.pottumuttu@xxxxxxxxx> > > Subject: Re: [PATCH v6] drm/i915/selftests: Implement frequency > > logging for energy reading validation > > > > > > > > On 13-11-2024 15:20, Sk Anirban wrote: > > > Introduce RC6 & RC0 frequency logging mechanism to ensure accurate > > > energy readings aimed at addressing GPU energy leaks and power > > > measurement failures. > > > This enhancement will help ensure the accuracy of energy readings. > > > > > > v2: > > > - Improved commit message. > > > v3: > > > - Used pr_err log to display frequency (Anshuman) > > > - Sorted headers alphabetically (Sai Teja) > > > v4: > > > - Improved commit message. > > > - Fix pr_err log (Sai Teja) > > > v5: > > > - Add error & debug logging for RC0 power and frequency checks > > > (Anshuman) > > > v6: > > > - Modify debug logging for RC0 power and frequency checks (Sai > > > Teja) > > > > > > Signed-off-by: Sk Anirban <sk.anirban@xxxxxxxxx> > > > Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@xxxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/gt/selftest_rc6.c | 15 +++++++++++++-- > > > 1 file changed, 13 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c > > > b/drivers/gpu/drm/i915/gt/selftest_rc6.c > > > index 1aa1446c8fb0..a8776f88d6a1 100644 > > > --- a/drivers/gpu/drm/i915/gt/selftest_rc6.c > > > +++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c > > > @@ -8,6 +8,7 @@ > > > #include "intel_gpu_commands.h" > > > #include "intel_gt_requests.h" > > > #include "intel_ring.h" > > > +#include "intel_rps.h" > > > #include "selftest_rc6.h" > > > > > > #include "selftests/i915_random.h" > > > @@ -38,6 +39,9 @@ int live_rc6_manual(void *arg) > > > ktime_t dt; > > > u64 res[2]; > > > int err = 0; > > > + u32 rc0_freq = 0; > > > + u32 rc6_freq = 0; > > > + struct intel_rps *rps = >->rps; > > > > > > /* > > > * Our claim is that we can "encourage" the GPU to enter rc6 at will. > > > @@ -66,6 +70,7 @@ int live_rc6_manual(void *arg) > > > rc0_power = librapl_energy_uJ() - rc0_power; > > > dt = ktime_sub(ktime_get(), dt); > > > res[1] = rc6_residency(rc6); > > > + rc0_freq = intel_rps_read_actual_frequency(rps); > > > if ((res[1] - res[0]) >> 10) { > > > pr_err("RC6 residency increased by %lldus while disabled for > > 1000ms!\n", > > > (res[1] - res[0]) >> 10); @@ -77,7 +82,11 @@ int > > > live_rc6_manual(void *arg) > > > rc0_power = div64_u64(NSEC_PER_SEC * rc0_power, > > > ktime_to_ns(dt)); > > > if (!rc0_power) { > > > - pr_err("No power measured while in RC0\n"); > > > + if (rc0_freq) > > > + pr_err("No power measured while in RC0! > > GPU Freq: %u in RC0\n", > > > + rc0_freq); > If rc0 frequency is there then, this has to be pr_dbg, otherwise what is the > purpose of this patch. It should return from here without any error in case there is no rc0 power but we do have rc0 frequency. Thanks, Anshuman. > > > + else > > > + pr_err("No power and freq measured while in > > RC0\n"); > > > err = -EINVAL; > > > goto out_unlock; > > > } > > > @@ -91,6 +100,7 @@ int live_rc6_manual(void *arg) > > > dt = ktime_get(); > > > rc6_power = librapl_energy_uJ(); > > > msleep(100); > > > + rc6_freq = intel_rps_read_actual_frequency(rps); > > > > I think intention of reading frequency here is to know if device was > > not in RC6 when there is failure. But for the platforms below gen12 > > reading act frequency will cause gt wake as GEN6_RPSTAT reg requires > forcewake. > > To avoid wake when device is in RC6 read actual frequency without > > applying forcewake. > If reading act_freq will wake the device, How to read frequency without > forcewake then ? > > Thanks, > Anshuaman > > > > Additionally add delay, may be delay of 1 seconds after re-enabling > > RC6 manually and forcewake flush. > > > > Regards, > > Badal > > > > > rc6_power = librapl_energy_uJ() - rc6_power; > > > dt = ktime_sub(ktime_get(), dt); > > > res[1] = rc6_residency(rc6); > > > @@ -108,7 +118,8 @@ int live_rc6_manual(void *arg) > > > pr_info("GPU consumed %llduW in RC0 and %llduW in > > RC6\n", > > > rc0_power, rc6_power); > > > if (2 * rc6_power > rc0_power) { > > > - pr_err("GPU leaked energy while in RC6!\n"); > > > + pr_err("GPU leaked energy while in RC6! GPU Freq: > > %u in RC6 and %u in RC0\n", > > > + rc6_freq, rc0_freq); > > > err = -EINVAL; > > > goto out_unlock; > > > }