Hi,

Please bear with my elaboration of the problem.

string.h has:

    #define memcpy(d,s,l) __builtin_memcpy(d,s,l)
    #define memset(d,c,l) __builtin_memset(d,c,l)
    #define memcmp __builtin_memcmp

and string.c has the function implementations of them.

In the Linux kernel, both arch/x86/boot/compressed/pgtable_64.c and
arch/x86/boot/compressed/kaslr.c include string.h, and both .c files use
memcpy. But, from the nm output of both .o files, kaslr.o has a memcpy
entry, while pgtable_64.o doesn't.

Apparently, __builtin_memcpy is sometimes expanded to inline code, and
sometimes emitted as a call to the local memcpy().

So the answer seems to be: GCC decides heuristically how to expand
__builtin_memcpy, and the result is not fixed; it is decided case by
case. And -mstringop-strategy=byte_loop helped me confirm that we can
control the optimization behaviour.

Thank you both, Jakub and Martin! This frees me from the headache of
these two days :)

--
Sincerely,
Cao jin

On 1/11/19 11:03 AM, Cao jin wrote:
> Hi,
> (pls CC me when replying because I am not subscriber)
>
> I met an interesting phenomenon when looking into linux kernel
> compilation, it can be simply summarized as following: in
> arch/x86/boot/compressed, memcpy is defined as __builtin_memcpy, while
> also implemented as a function. But when using memcpy, in some case GCC
> optimize it to inline code, in other case GCC just emit a call to
> self-defined memcpy function. This can be confirmed according to the
> symbol table via `nm bluh.o`.
>
> The compiling flags is, for example:
> cmd_arch/x86/boot/compressed/pgtable_64.o := gcc
> -Wp,-MD,arch/x86/boot/compressed/.pgtable_64.o.d -nostdinc -isystem
> /usr/lib/gcc/x86_64-redhat-linux/8/include -I./arch/x86/include
> -I./arch/x86/include/generated -I./include
> -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi
> -I./include/uapi -I./include/generated/uapi -include
> ./include/linux/kconfig.h -include ./include/linux/compiler_types.h
> -D__KERNEL__ -DCONFIG_CC_STACKPROTECTOR -m64 -O2 -fno-strict-aliasing
> -fPIE -DDISABLE_BRANCH_PROFILING -mcmodel=small -mno-mmx -mno-sse
> -ffreestanding -fno-stack-protector -DKBUILD_BASENAME='"pgtable_64"'
> -DKBUILD_MODNAME='"pgtable_64"' -c -o
> arch/x86/boot/compressed/pgtable_64.o arch/x86/boot/compressed/pgtable_64.c
>
> Now the questions is: from code-reading, it is kind of non-intuitive, is
> there any explicit way to control the optimization behavior accurately?
>
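
P.S. For anyone who finds this thread later: the effect discussed here can
probably be reproduced outside the kernel with a small standalone file.
This is only a sketch. The file name demo.c, the struct and the two
function names are made up for illustration, and whether each copy is
inlined depends on the GCC version, the target and the flags.

```c
/* demo.c: hypothetical reproducer, not taken from the kernel tree. */
#include <stddef.h>

/* Same mapping as in the string.h described in this thread. */
#define memcpy(d, s, l) __builtin_memcpy(d, s, l)

/* Hypothetical 16-byte type, used only to get a small constant size. */
struct small {
    char bytes[16];
};

/*
 * The length is a small compile-time constant. GCC will usually expand
 * this into a few move instructions, leaving no reference to memcpy.
 */
void copy_fixed(struct small *dst, const struct small *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/*
 * The length is only known at run time. Depending on its heuristics,
 * GCC often emits a call to the external symbol memcpy here.
 */
void copy_variable(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}
```

Compiling with `gcc -O2 -ffreestanding -c demo.c` and then running
`nm demo.o` shows whether an undefined memcpy symbol is left in the
object file. Adding -mstringop-strategy=byte_loop on x86 is expected to
replace that call with an inline loop, which matches what I saw with the
kernel objects.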