Re: How to determine what is default x86-64 -march= value for a given release of gcc?

Jonathan Wakely via Gcc-help <gcc-help@xxxxxxxxxxx> · Mon, 23 Sep 2024 20:14:53 +0100

On Mon, 23 Sept 2024 at 19:21, Bradley Lucier via Gcc-help
<gcc-help@xxxxxxxxxxx> wrote:
>
> On 9/23/24 13:51, Alexander Monakov wrote:
> >
> > On Mon, 23 Sep 2024, Bradley Lucier via Gcc-help wrote:
> >
> >> So it appears that the Xeon X5460 CPU from 2008 doesn't have the rdtscp
> >> instruction, which is used by default by this version of gcc on x86-64 when no
> >> architecture is specified.
> >
> > GCC emits neither the rdtsc nor the rdtscp instruction on its own; it can
> > appear only if the source code uses the corresponding built-in or
> > intrinsic, and GCC doesn't "promote" rdtsc to rdtscp depending on -march.
> > So if you see rdtscp in the binary there must be rdtscp in the source code. Can you
> > investigate the source corresponding to the problematic code in the binary?
>
> Thank you!  The usage is
>
> #ifdef ___CPU_x86
> #if __has_builtin(__builtin_ia32_rdtscp)

This will always be true on x86_64, even for an -march that doesn't
support the instruction. What __has_builtin tells you is that GCC
knows that built-in function, not that the instruction is available in
the current instruction set selected by -march.
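
To make the distinction concrete, here is a minimal sketch that asks the
two questions separately. The names COMPILER_KNOWS_RDTSCP, cpu_has_rdtscp
and report_rdtscp are made up for this example. It assumes GCC or Clang,
whose <cpuid.h> provides __get_cpuid(), and relies on CPUID leaf
0x80000001 reporting RDTSCP in EDX bit 27.

```c
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang header providing __get_cpuid() */
#endif

/* Compile-time question: does the compiler know the built-in? */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_ia32_rdtscp)
#    define COMPILER_KNOWS_RDTSCP 1
#  endif
#endif
#ifndef COMPILER_KNOWS_RDTSCP
#  define COMPILER_KNOWS_RDTSCP 0
#endif

/* Run-time question: does the CPU we are running on implement rdtscp?
   CPUID leaf 0x80000001 reports it in EDX bit 27. */
static int cpu_has_rdtscp(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;               /* extended leaf not available */
    return (int)((edx >> 27) & 1u);
#else
    return 0;                   /* not an x86 target */
#endif
}

/* Print both answers; they are independent of each other. */
static void report_rdtscp(void)
{
    printf("compiler knows built-in: %d\n", COMPILER_KNOWS_RDTSCP);
    printf("CPU supports rdtscp:     %d\n", cpu_has_rdtscp());
}
```

Calling report_rdtscp() from main() on a CPU without the instruction
should show the first answer as 1 and the second as 0.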

> #undef ___USE___builtin_ia32_rdtscp
> #define ___USE___builtin_ia32_rdtscp

This is a reserved name. Unless your library is part of the kernel,
the C library, or the C compiler, you should not be defining names
that start with double underscores.

> #endif
> #endif
>
> #ifdef ___DONT_USE___builtin_ia32_rdtscp
> #undef ___USE___builtin_ia32_rdtscp
> #endif
>
> <cut>
>
> #ifndef ___GET_CPUCYCLECOUNT
> #ifdef ___USE___builtin_ia32_rdtscp
> #ifdef __GNUC__
>
> #if ___WORD_WIDTH == 64
>
> #define ___GET_CPUCYCLECOUNT ___FIX(___s64_temp)
> #define ___CPUCYCLECOUNTSTART \
> __asm__ __volatile__ ("cpuid\n\trdtsc\n\tshl $32,%%rdx\n\tor %%rdx,%%rax\n\tmov %%rax,%0\n":"=r"(___s64_temp)::"%rax","%rbx","%rcx","%rdx");
> #define ___CPUCYCLECOUNTEND \
> __asm__ __volatile__ ("rdtscp\n\tshl $32,%%rdx\n\tor %%rdx,%%rax\n\tmov %%rax,%0\n\tcpuid\n":"=r"(___s64_temp)::"%rax","%rbx","%rcx","%rdx");
>
> #else
>
> #define ___GET_CPUCYCLECOUNT ___FIX(___s32_temp)
> #define ___CPUCYCLECOUNTSTART \
> __asm__ __volatile__ ("cpuid\n\trdtsc\n\tmov %%eax,%0\n":"=r"(___s32_temp)::"%eax","%ebx","%ecx","%edx");
> #define ___CPUCYCLECOUNTEND \
> __asm__ __volatile__ ("rdtscp\n\tmov %%eax,%0\n\tcpuid\n":"=r"(___s32_temp)::"%eax","%ebx","%ecx","%edx");
>
> #endif
>
> #endif
> #endif
> #endif
>
> So the use seems to be guarded by
>
> __has_builtin(__builtin_ia32_rdtscp)
>
> I compiled the code on the same machine that it failed on.  So it thinks
> it has the builtin and then it doesn't, right?

It has the built-in. That doesn't mean the processor supports the instruction.

>
> I get this output:
>
>   gcc-12 -Q --help=target | grep enabled
>    -m128bit-long-double                 [enabled]
>    -m64                                 [enabled]
>    -m80387                              [enabled]
>    -malign-stringops                    [enabled]
>    -mdirect-extern-access               [enabled]
>    -mfancy-math-387                     [enabled]
>    -mfp-ret-in-387                      [enabled]
>    -mfxsr                               [enabled]
>    -mglibc                              [enabled]
>    -mhard-float                         [enabled]
>    -mieee-fp                            [enabled]
>    -mlong-double-80                     [enabled]
>    -mmmx                                [enabled]
>    -mno-sse4                            [enabled]
>    -mpush-args                          [enabled]
>    -mred-zone                           [enabled]
>    -msse                                [enabled]
>    -msse2                               [enabled]
>    -mstv                                [enabled]
>    -mtls-direct-seg-refs                [enabled]
>    -mvzeroupper                         [enabled]
>
> Is this relevant?
>
> >> If I say
> >>
> >> gcc -Q --help=target
> >>
> >> I get
> >>
> >>    -march=                                     x86-64
> >>
> >>
> >> with a lot of other options enabled or disabled.
> >>
> >> So, is there a way to tell what the default -march= value for a given version
> >> of gcc is?
> >
> > -Q --help=target gives that value.
> OK, so that gave me "x86-64".

That's the default when targeting x86_64, unless GCC was configured
with either --with-arch=haswell or --target=haswell-pc-linux-gnu, both
of which would set -march=haswell as the default.
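
You can also see the effect of the selected -march from inside the source,
because GCC predefines a macro for each ISA extension it is allowed to
use. A small sketch (the function name print_enabled_isa_macros is made
up for this example, and it checks only a handful of the macros):

```c
#include <stdio.h>

/* Print the ISA-extension macros the compiler predefined for this
   translation unit and return how many of the checked ones were set.
   These macros follow -march and the -m<isa> options in effect. */
static int print_enabled_isa_macros(void)
{
    int n = 0;
#ifdef __SSE2__
    puts("__SSE2__");   n++;
#endif
#ifdef __SSE4_2__
    puts("__SSE4_2__"); n++;
#endif
#ifdef __AVX__
    puts("__AVX__");    n++;
#endif
#ifdef __AVX2__
    puts("__AVX2__");   n++;
#endif
    return n;
}
```

Built with the default -march=x86-64 and no other -m options, I would
expect only __SSE2__ from this list; with -march=haswell all four should
appear. None of these macros says anything about the CPU the program
later runs on.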

>  Does that mean that code generated by
> gcc-12 will run on any processor on this page:
>
> https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/x86-Options.html
>
> from "nocona" on?

Some of the CPUs below there are 32-bit only, e.g. k6. It means GCC
will generate code that will run on anything "with 64-bit extensions" or
"with x86-64 instruction set support".

But the -march setting only controls which instructions GCC itself
will generate in the output. If you compile asm statements that use
rdtscp, that instruction will be in the output, and if it is reached
at runtime the processor needs to support it.
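
One way to cope with that is to decide at run time which instruction to
execute. The following is only a sketch, not a drop-in replacement for
the macros quoted above: the names have_rdtscp and read_cycle_counter
are made up, it assumes GCC or Clang (<cpuid.h> and the
__builtin_ia32_rdtsc / __builtin_ia32_rdtscp built-ins), and it leaves
out the serializing cpuid that the original asm uses.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* 1 if CPUID leaf 0x80000001 reports RDTSCP (EDX bit 27), else 0. */
static int have_rdtscp(void)
{
    unsigned int eax, ebx, ecx, edx;
    return __get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)
           && ((edx >> 27) & 1u);
}

/* Read the time-stamp counter, executing rdtscp only where the CPU
   has it. The cached flag is not thread-safe; good enough for a sketch. */
static uint64_t read_cycle_counter(void)
{
    static int checked = 0, use_rdtscp = 0;
    if (!checked) {
        use_rdtscp = have_rdtscp();
        checked = 1;
    }
    if (use_rdtscp) {
        unsigned int aux;   /* receives IA32_TSC_AUX, unused here */
        return __builtin_ia32_rdtscp(&aux);
    }
    return __builtin_ia32_rdtsc();   /* plain rdtsc as the fallback */
}
#else
/* Not an x86 target: no time-stamp counter to read in this sketch. */
static uint64_t read_cycle_counter(void) { return 0; }
#endif
```

The rdtscp instruction is still present in the binary, but it sits
behind the CPUID check, so it is never reached on a processor that
lacks it.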