On Mon, 23 Sept 2024 at 19:21, Bradley Lucier via Gcc-help <gcc-help@xxxxxxxxxxx> wrote:
>
> On 9/23/24 13:51, Alexander Monakov wrote:
> >
> > On Mon, 23 Sep 2024, Bradley Lucier via Gcc-help wrote:
> >
> >> So it appears that the Xeon X5460 CPU from 2008 doesn't have the rdtscp
> >> instruction, which is used by default by this version of gcc on x86-64 when no
> >> architecture is specified.
> >
> > GCC doesn't emit neither the rdtsc nor rdtscp instruction on its own, it can
> > appear only if the source code is using the corresponding built-in or
> > instrinsic; and GCC doesn't "promote" rdtsc to rdtscp depending on -march.
> > So if you see rdtscp in binary there must be rdtscp in source code. Can you
> > investigate the source corresponding to the problematic code in the binary?
>
> Thank you! The usage is
>
> #ifdef ___CPU_x86
> #if __has_builtin(__builtin_ia32_rdtscp)

This will always be true on x86_64, even for an -march that doesn't
support the instruction. What __has_builtin tells you is that GCC
knows that built-in function, not that the instruction is available in
the current instruction set selected by -march.

> #undef ___USE___builtin_ia32_rdtscp
> #define ___USE___builtin_ia32_rdtscp

This is a reserved name. Unless your library is part of the kernel,
the C library, or the C compiler, you should not be defining names
starting with double underscores.
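
To illustrate the __has_builtin point above, here is a minimal sketch
that separates the compile-time question (does the compiler know the
built-in?) from the run-time question (does this CPU implement
rdtscp?). The names COMPILER_KNOWS_RDTSCP and cpu_has_rdtscp are
hypothetical and not part of the code under discussion. The run-time
check assumes GCC's <cpuid.h> helper __get_cpuid and the documented
CPUID feature bit for RDTSCP (leaf 0x80000001, EDX bit 27).

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid */
#endif

/* Compile-time question: does the compiler recognise the built-in?
   This says nothing about the CPU the program will eventually run on. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_ia32_rdtscp)
#    define COMPILER_KNOWS_RDTSCP 1   /* hypothetical macro name */
#  endif
#endif
#ifndef COMPILER_KNOWS_RDTSCP
#  define COMPILER_KNOWS_RDTSCP 0
#endif

/* Run-time question: does the CPU executing this code implement rdtscp?
   Hypothetical helper; reads CPUID leaf 0x80000001 and tests EDX bit 27. */
static inline bool cpu_has_rdtscp(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return false;
    return ((edx >> 27) & 1u) != 0;
#else
    return false;   /* not an x86 target, so no rdtscp */
#endif
}
```

On a machine like the Xeon X5460 described in this thread, one would
expect COMPILER_KNOWS_RDTSCP to be 1 while cpu_has_rdtscp() returns
false, which is exactly the mismatch that causes the failure.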
> #endif
> #endif
>
> #ifdef ___DONT_USE___builtin_ia32_rdtscp
> #undef ___USE___builtin_ia32_rdtscp
> #endif
>
> <cut>
>
> #ifndef ___GET_CPUCYCLECOUNT
> #ifdef ___USE___builtin_ia32_rdtscp
> #ifdef __GNUC__
>
> #if ___WORD_WIDTH == 64
>
> #define ___GET_CPUCYCLECOUNT ___FIX(___s64_temp)
> #define ___CPUCYCLECOUNTSTART \
> __asm__ __volatile__ ("cpuid\n\trdtsc\n\tshl $32,%%rdx\n\tor %%rdx,%%rax\n\tmov %%rax,%0\n":"=r"(___s64_temp)::"%rax","%rbx","%rcx","%rdx");
> #define ___CPUCYCLECOUNTEND \
> __asm__ __volatile__ ("rdtscp\n\tshl $32,%%rdx\n\tor %%rdx,%%rax\n\tmov %%rax,%0\n\tcpuid\n":"=r"(___s64_temp)::"%rax","%rbx","%rcx","%rdx");
>
> #else
>
> #define ___GET_CPUCYCLECOUNT ___FIX(___s32_temp)
> #define ___CPUCYCLECOUNTSTART \
> __asm__ __volatile__ ("cpuid\n\trdtsc\n\tmov %%eax,%0\n":"=r"(___s32_temp)::"%eax","%ebx","%ecx","%edx");
> #define ___CPUCYCLECOUNTEND \
> __asm__ __volatile__ ("rdtscp\n\tmov %%eax,%0\n\tcpuid\n":"=r"(___s32_temp)::"%eax","%ebx","%ecx","%edx");
>
> #endif
>
> #endif
> #endif
> #endif
>
> So the use seems to be guarded by
>
> __has_builtin(__builtin_ia32_rdtscp)
>
> I compiled the code on the same machine that it failed on. So it thinks
> it has the builtin and then it doesn't, right?

It has the built-in. That doesn't mean the processor supports the instruction.

>
> I get this output:
>
> gcc-12 -Q --help=target | grep enabled
>   -m128bit-long-double        [enabled]
>   -m64                        [enabled]
>   -m80387                     [enabled]
>   -malign-stringops           [enabled]
>   -mdirect-extern-access      [enabled]
>   -mfancy-math-387            [enabled]
>   -mfp-ret-in-387             [enabled]
>   -mfxsr                      [enabled]
>   -mglibc                     [enabled]
>   -mhard-float                [enabled]
>   -mieee-fp                   [enabled]
>   -mlong-double-80            [enabled]
>   -mmmx                       [enabled]
>   -mno-sse4                   [enabled]
>   -mpush-args                 [enabled]
>   -mred-zone                  [enabled]
>   -msse                       [enabled]
>   -msse2                      [enabled]
>   -mstv                       [enabled]
>   -mtls-direct-seg-refs       [enabled]
>   -mvzeroupper                [enabled]
>
> Is this relevant?
>
> >> If I say
> >>
> >> gcc -Q --help=target
> >>
> >> I get
> >>
> >>   -march=                     x86-64
> >>
> >>
> >> with a lot of other options enabled or disabled.
> >>
> >> So, is there a way to tell what the default -march= value for a given version
> >> of gcc is?
> >
> > -Q --help=target gives that value.
> OK, so that gave me "x86-64".

That's the default when targeting x86_64, unless GCC was configured
with either --with-arch=haswell or --target=haswell-pc-linux-gnu, both
of which would set -march=haswell as the default.

> Does that mean that code generated by
> gcc-12 will run on any processor on this page:
>
> https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/x86-Options.html
>
> from "nocona" on?

Some of the CPUs below there are 32-bit only, e.g. k6. It means GCC
will generate code that will run on anything described on that page as
"with 64-bit extensions" or "with x86-64 instruction set support".

But the -march setting only controls which instructions GCC itself
will generate in the output. If you compile asm statements that use
rdtsc, that instruction will be in the output, and if it is reached at
runtime the processor needs to support it.
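
As a small sketch of that last point, the function below (the name
read_tsc_via_asm is hypothetical, not taken from the code in this
thread) reads the time-stamp counter with an asm statement. The
compiler passes the asm text through as written, whatever -march is in
effect. The example deliberately uses rdtsc rather than rdtscp because,
as far as I know, every x86-64 processor implements rdtsc.

```c
#include <stdint.h>

/* Hypothetical helper: read the time-stamp counter via an asm statement.
   The instruction text is copied into the output as written; -march does
   not add, remove, or replace it. */
static inline uint64_t read_tsc_via_asm(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;

    /* rdtsc returns the low 32 bits in EAX and the high 32 bits in EDX;
       the "=a" and "=d" output constraints bind those registers. */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;   /* not an x86 target: no time-stamp counter to read */
#endif
}
```

Changing the mnemonic to rdtscp would, I would expect, still build
with the default -march=x86-64 and then fault at runtime on a CPU
without that instruction, so such code needs its own run-time check.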