On Thu, Oct 28, 2021 at 2:22 AM Pasha Tatashin
<pasha.tatashin@xxxxxxxxxx> wrote:
> > I found some atomic_add/dec are replaced with atomic_add/dec_return,
>
> I am going to replace -return variants with -fetch variants, potentially -fetch
>
> > those helpers with return value imply a full memory barrier around it, but
> > others without return value do not. Do you have any numbers to show
> > the impact? Maybe atomic_add/dec_return_relaxed can help this.
>
> The generic variant uses arch_cmpxchg() for all atomic variants
> without any extra barriers. Therefore, on platforms that use generic
> implementations there won't be performance differences except for an
> extra branch that checks results when VM_BUG_ON is enabled.
>
> On x86 the difference between the two is the following:
>
> atomic_add:
>     lock add %eax,(%rsi)
>
> atomic_fetch_add:
>     lock xadd %eax,(%rsi)
>
> atomic_fetch_add_relaxed:
>     lock xadd %eax,(%rsi)
>
> No differences between relaxed and non-relaxed variants.

Right. There is no difference on x86. Maybe there are differences in
other architectures.

> However, we used lock xadd instead of lock add. I am not sure if the
> performance is going to be different.
>
> Pasha
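
For anyone who wants to compare the variants outside the kernel, here is a
minimal userspace sketch using C11 <stdatomic.h>. The sketch_* names are
hypothetical stand-ins, not kernel APIs, and memory_order_seq_cst is only an
approximation of the kernel's fully ordered semantics. It illustrates the
return-value conventions discussed above: the -fetch variants return the old
value, the -return variant returns the new one.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for the kernel helpers in this thread. */

/* Like atomic_add(): no return value and no ordering requirement.
 * With the result unused, x86 compilers can typically emit "lock add". */
static void sketch_atomic_add(atomic_int *v, int i)
{
    (void)atomic_fetch_add_explicit(v, i, memory_order_relaxed);
}

/* Like atomic_fetch_add(): returns the OLD value, ordered (approximation). */
static int sketch_atomic_fetch_add(atomic_int *v, int i)
{
    return atomic_fetch_add_explicit(v, i, memory_order_seq_cst);
}

/* Like atomic_fetch_add_relaxed(): returns the OLD value, no ordering. */
static int sketch_atomic_fetch_add_relaxed(atomic_int *v, int i)
{
    return atomic_fetch_add_explicit(v, i, memory_order_relaxed);
}

/* Like atomic_add_return(): returns the NEW value, ordered (approximation). */
static int sketch_atomic_add_return(atomic_int *v, int i)
{
    return atomic_fetch_add_explicit(v, i, memory_order_seq_cst) + i;
}

/* Exercises each helper once and returns the final counter value. */
static int sketch_check(void)
{
    atomic_int v;
    atomic_init(&v, 0);

    sketch_atomic_add(&v, 5);                                /* 0 -> 5 */
    int old = sketch_atomic_fetch_add(&v, 1);                /* 5 -> 6 */
    int old_relaxed = sketch_atomic_fetch_add_relaxed(&v, 1);/* 6 -> 7 */
    int new_val = sketch_atomic_add_return(&v, 1);           /* 7 -> 8 */

    /* -fetch returns the value before the add, -return the value after. */
    assert(old == 5);
    assert(old_relaxed == 6);
    assert(new_val == 8);

    return atomic_load(&v);
}
```

Building this with optimization enabled and inspecting the generated assembly
is one way to check which instructions a given compiler and architecture
choose for each variant; the result may differ from what the kernel's own
implementations produce.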