On Mon, 15 May 2023 03:10:56 -0700, Tvrtko Ursulin wrote: > Hi Tvrtko, > On 12/05/2023 21:57, Umesh Nerlige Ramappa wrote: > > On Fri, May 12, 2023 at 11:56:18AM +0100, Tvrtko Ursulin wrote: > >> > >> On 12/05/2023 02:08, Dixit, Ashutosh wrote: > >>> On Fri, 05 May 2023 17:58:15 -0700, Umesh Nerlige Ramappa wrote: > >>>> > >>>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>>> > >>>> Reserve some bits in the counter config namespace which will carry the > >>>> tile id and prepare the code to handle this. > >>>> > >>>> No per tile counters have been added yet. > >>>> > >>>> v2: > >>>> - Fix checkpatch issues > >>>> - Use 4 bits for gt id in non-engine counters. Drop FIXME. > >>>> - Set MAX GTs to 4. Drop FIXME. > >>>> > >>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx> > > 8< > > >>>> +static u64 frequency_enabled_mask(void) > >>>> +{ > >>>> + unsigned int i; > >>>> + u64 mask = 0; > >>>> + > >>>> + for (i = 0; i < I915_PMU_MAX_GTS; i++) > >>>> + mask |= config_mask(__I915_PMU_ACTUAL_FREQUENCY(i)) | > >>>> + config_mask(__I915_PMU_REQUESTED_FREQUENCY(i)); > >>>> + > >>>> + return mask; > >>>> +} > >>>> + > >>>> static bool pmu_needs_timer(struct i915_pmu *pmu, bool gpu_active) > >>>> { > >>>> struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), > >>>> pmu); > >>>> @@ -120,9 +147,7 @@ static bool pmu_needs_timer(struct i915_pmu *pmu, > >>>> bool gpu_active) > >>>> * Mask out all the ones which do not need the timer, or in > >>>> * other words keep all the ones that could need the timer. > >>>> */ > >>>> - enable &= config_mask(I915_PMU_ACTUAL_FREQUENCY) | > >>>> - config_mask(I915_PMU_REQUESTED_FREQUENCY) | > >>>> - ENGINE_SAMPLE_MASK; > >>>> + enable &= frequency_enabled_mask() | ENGINE_SAMPLE_MASK; > >>> > >>> u32 enable & u64 frequency_enabled_mask > >>> > >>> ugly but ok I guess? Or change enable to u64? > > > > making pmu->enable u64 as well as other places where it is assigned to > > local variables. > > > >> > >> Hmm.. yes very ugly. Could have been an accident which happened to work > >> because there is a single timer (not per tile). > > > > Happened to work because the frequency mask does not spill over to the > > upper 32 bits (even for multi tile). > > > > --------------------- START_SECTION ---------------- > >> > >> Similar issue in frequency_sampling_enabled too. Gt_id argument to it > >> seems pointless. > > > > Not sure why it's pointless. We need the gt_id to determine the right > > mask for that specific gt. If it's not enabled, then we just return > > without pm_get and async put (like you mention later). > > And this piece of code is called within for_each_gt. > > I think I got a little confused cross referencing the code and patches last > week and did not mentally see all the changes. > > Because the hunk in other_bit() is correctly adding support for per gt bits. > > The layout of pmu->enable ends up like this: > > bits 0 - 2: engine events > bits 3 - 5: gt0 other events > bits 6 - 8: gt1 other events > bits 9 - 11: gt2 other events > bits 12 - 14: gt3 other events Correct. > > >> So I now think whole frequency_enabled_mask() is just pointless and > >> should be removed. And then pmu_needs_time code can stay as is. Possibly > >> add a config_mask_32 helper which ensures no bits in upper 32 bits are > >> returned. > >> > >> That is if we are happy for the frequency_sampling_enabled returning > >> true for all gts, regardless of which ones actually have frequency > >> sampling enabled. > >> > >> Or if we want to implement it as I probably have intended, we will need > >> to add some gt bits into pmu->enable. Maybe reserve top four same as > >> with config counters. > > > > Nope. What you have here works just fine. pmu->enable should not include > > any gt id info. gt_id[63:60] is only a concept for pmu config sent by > > user. config_mask and pmu->enable are i915 internal bookkeeping (bit > > masks) just to track what events need to be sampled. The 'other' bit > > masks are a function of gt_id because we use gt_id to calculate a > > contiguous numerical value for these 'other' events. That's all. Once the > > numerical value is calculated, there is no need for gt_id because > > config_mask is BIT_ULL(numerical_value). Since the numerical values never > > exceeded 31 (even for multi-gts), everything worked even with 32 bit > > pmu->enable. > > Yep. > > So question then is why make pmu->enable u64? The only reason was simplicity, since a lot of the existing code already assumes u64. E.g. if we keep pmu->enable u32, we should have to do the following: * Change config_mask() return type to u32 (in frequency_sampling_enabled(), we have 'pmu->enable & config_mask()') * Change frequency_enabled_mask() return type to u32 (again uses config_mask() so if we change config_mask() to u32 we change return type here too) * In i915_pmu_enable(), change 'pmu->enable |= BIT_ULL(bit)' to 'pmu->enable |= BIT(bit)' So yes, if we think we should be using pmu->enable u32, let's change this to be consistent everywhere. > Instead frequency_enabled_mask() should be made u32 since the bitwise or > composition of config_masks() is guaranteed to fit. > > At most it can have an internal u64 for the mask, assert upper_32_bits are > zero and return lower_32_bits. > > Did I get it right this time round? :) Yes, though we'd have to make the config_mask() type change above to make frequency_enabled_mask() u32. Or we can just keep everything u64. Let's decide one way or the other and close this. It seems Tvrtko is leaning towards making pmu->enable and frequency_enabled_mask() u32? Thanks. -- Ashutosh