On Mon, 16 Sep 2019 12:17:54 -0700, Umesh Nerlige Ramappa wrote: > > On Sun, Sep 15, 2019 at 02:24:41PM +0300, Lionel Landwerlin wrote: > > On 14/09/2019 02:06, Umesh Nerlige Ramappa wrote: > >> OA perf unit supports non-power of 2 report sizes. Enable support for > >> these sizes in the driver. > >> > >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/i915_perf.c | 59 ++++++++++++-------------------- > >> 1 file changed, 21 insertions(+), 38 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > >> index 50b6d154fd46..482fca3da7de 100644 > >> --- a/drivers/gpu/drm/i915/i915_perf.c > >> +++ b/drivers/gpu/drm/i915/i915_perf.c > >> @@ -450,7 +450,7 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) > >> u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma); > >> int report_size = stream->oa_buffer.format_size; > >> unsigned long flags; > >> - u32 hw_tail; > >> + u32 hw_tail, aging_tail; > >> u64 now; > >> /* We have to consider the (unlikely) possibility that read() errors > >> @@ -459,16 +459,17 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) > >> */ > >> spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); > >> - hw_tail = dev_priv->perf.ops.oa_hw_tail_read(stream); > >> + hw_tail = dev_priv->perf.ops.oa_hw_tail_read(stream) - gtt_offset; > >> + aging_tail = stream->oa_buffer.aging_tail - gtt_offset; > >> /* The tail pointer increases in 64 byte increments, > >> * not in report_size steps... > >> */ > >> - hw_tail &= ~(report_size - 1); > >> + hw_tail = OA_TAKEN(hw_tail, (OA_TAKEN(hw_tail, aging_tail) % report_size)); > > > > > > I'm struggling to parse this line above and I'm not 100% sure it's correct. > > > > Could add a comment to explain what is going on? > > The aging tail is always pointing to the boundary of a report whereas > the hw_tail is advancing in 64 byte increments. > > The innermost OA_TAKEN is returning the number of bytes between the > hw_tail and the aging_tail. The modulo is getting the size of the > partial report (if any). > > The outermost OA_TAKEN is subtracting the size of partial report from > the hw_tail to get a hw_tail that points to the boundary of the last > full report. > > The value of hw_tail would be the same as from the deleted line of code > above this line. Looks like aging_tail should not be needed (it is complicating the expression and is not there in the original expression). All we need is a "circular modulus". For example would the following work? if (hw_tail < report_size) hw_tail += OA_BUFFER_SIZE; hw_tail = rounddown(hw_tail, report_size); Another way (if we want to avoid the division) would be to maintain a cached copy of hw_tail in SW which is successively (and circularly) incremented by report_size till it catches up with hw_tail read from HW. But probably the first method above is simpler. _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx