On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> wrote: > > On Mon, 03 Jul 2023, Uros Bizjak <ubizjak@xxxxxxxxx> wrote: > > Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old > > in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag, > > so this change saves a compare after cmpxchg (and related move instruction > > in front of cmpxchg). > > > > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg > > fails. There is no need to re-read the value in the loop. > > > > No functional change intended. > > > > Cc: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> > > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > > Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > > Cc: David Airlie <airlied@xxxxxxxxx> > > Cc: Daniel Vetter <daniel@xxxxxxxx> > > Signed-off-by: Uros Bizjak <ubizjak@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/i915_pmu.c | 9 ++++----- > > 1 file changed, 4 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c > > index d35973b41186..108b675088ba 100644 > > --- a/drivers/gpu/drm/i915/i915_pmu.c > > +++ b/drivers/gpu/drm/i915/i915_pmu.c > > @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event) > > event->hw.state = PERF_HES_STOPPED; > > return; > > } > > -again: > > - prev = local64_read(&hwc->prev_count); > > - new = __i915_pmu_event_read(event); > > > > - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev) > > - goto again; > > + prev = local64_read(&hwc->prev_count); > > + do { > > + new = __i915_pmu_event_read(event); > > + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new)); > > You could save everyone a lot of time by actually documenting what these > functions do. Assume you don't know what local64_try_cmpxchg() does, and > see how many calls you have to go through to figure it out. These functions are documented in Documentation/atomic_t.txt (under "RMW ops:" section), and the difference is explained in a separate section "CMPXCHG vs TRY_CMPXCGS" in the same file. Uros. > Because the next time I encounter this code or a patch like this, I'm > probably going to have to do that again. > > To me, the old one was more readable. The optimization is meaningless to > me if it's not quantified but reduces readability. > > > BR, > Jani. > > > > > > local64_add(new - prev, &event->count); > > } > > -- > Jani Nikula, Intel Open Source Graphics Center