On Tue, Mar 28, 2023 at 12:34 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Tue, Mar 28, 2023 at 12:28 PM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> >
> > On Tue, Mar 28, 2023 at 11:53 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >
> > > [...]
> > > > > +	if (atomic_xchg(&stats_flush_ongoing, 1))
> > > >
> > > > Have you profiled this? I wonder if we should replace the above with
> > > >
> > > > 	if (atomic_read(&stats_flush_ongoing) || atomic_xchg(&stats_flush_ongoing, 1))
> > >
> > > I profiled the entire series with perf and I haven't noticed a notable
> > > difference between before and after the patch series -- but maybe some
> > > specific access patterns cause a regression, not sure.
> > >
> > > Does an atomic_cmpxchg() serve the same purpose? It's easier to read
> > > / more concise, I guess.
> > >
> > > Something like:
> > >
> > > 	if (atomic_cmpxchg(&stats_flush_ongoing, 0, 1))
> > >
> > > WDYT?
> >
> > No, I don't think cmpxchg will be any different from xchg(). On x86,
> > the cmpxchg will always write to stats_flush_ongoing and, depending on
> > the comparison result, it will either be 0 or 1 here.
> >
> > If you look at the implementation of queued_spin_trylock(), it does the
> > same thing as well.
>
> Interesting. I thought cmpxchg by definition would compare first and
> only do the write if stats_flush_ongoing == 0 in this case.
>
> I thought queued_spin_trylock() was doing an atomic_read() first to
> avoid the LOCK instruction unnecessarily when the lock is held by
> someone else.

Anyway, perhaps it's better to follow what queued_spin_trylock() is
doing, even if only to avoid locking the cache line unnecessarily.

(Although now that I think about it, I wonder why atomic_cmpxchg()
doesn't do this by default -- food for thought.)