Re: Libatomic 16B

Satish Vasudeva via Gcc-help <gcc-help@xxxxxxxxxxx> · Thu, 24 Feb 2022 08:42:56 -0800

I looked into this further. It seems that libat_load_16_i1 implements the
16B load as "lock cmpxchg16b (%rdi)".
This assumes that the CPU doesn't support 16B loads in a single
transaction. How can I compile libatomic to use intrinsics for the 16B load
instead of LOCK cmpxchg16b?
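
To illustrate what that instruction amounts to, here is a minimal sketch of a
16B load built on lock cmpxchg16b. It is not libatomic's source: the type
u128_pair and the helper load16_via_cmpxchg16b are hypothetical names made up
for this example, and it assumes x86-64 with GCC or Clang inline asm. The
point is that the "load" is really a locked read-modify-write, which would
explain contention when many threads read the same location.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit halves of a 16-byte value. Hypothetical type for this sketch. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/* Hypothetical helper, not libatomic's source. x86-64 only.
 * Atomically reads 16 bytes at p using lock cmpxchg16b with
 * expected == desired == 0:
 *   - if *p is 0, the instruction stores 0 back (value unchanged);
 *   - otherwise the compare fails and *p is copied into RDX:RAX.
 * Either way RDX:RAX ends up holding the current value of *p.
 * p must be 16-byte aligned and point to writable memory. */
static u128_pair load16_via_cmpxchg16b(u128_pair *p)
{
    uint64_t lo = 0, hi = 0; /* expected value: RAX = low, RDX = high */

    assert(((uintptr_t)p & 15u) == 0);

    __asm__ __volatile__(
        "lock cmpxchg16b %[mem]"
        : [mem] "+m"(*p), "+a"(lo), "+d"(hi)
        : "b"((uint64_t)0), "c"((uint64_t)0) /* desired: RBX = low, RCX = high */
        : "cc", "memory");

    return (u128_pair){ lo, hi };
}
```

A caller would declare the object as `_Alignas(16) u128_pair v;` and call
`load16_via_cmpxchg16b(&v)`. Since cmpxchg16b always issues a write cycle to
its destination, this approach cannot be used on read-only memory, and each
call needs exclusive ownership of the cache line.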

Appreciate your response.

Satish

On Wed, Feb 23, 2022 at 8:42 AM Satish Vasudeva <satish.vasudeva@xxxxxxxxxxxx> wrote:

> Hi Team,
>
> I was looking at the hotspots in our software stack and, interestingly,
> libat_load_16_i1 is near the top of the list.
>
> I am trying to understand why that is the case. My suspicion is some kind
> of lock usage for 16B atomic accesses.
>
> I came across this discussion but frankly I am still confused.
> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-01/msg02344.html
>
> Do you think the overhead of libat_load_16_i1 is due to spinlock usage?
> Also, from reading some other Intel CPU docs, it seems that the CPU does
> support loading 16B in a single access. In that case, can we optimize this
> for performance?
>
> Thanks and appreciate your help.
>
> Satish
>