Sam Ravnborg wrote:
> On Thu, May 07, 2009 at 03:26:49PM -0700, H. Peter Anvin wrote:
>> From: H. Peter Anvin <hpa at zytor.com>
>>
>> Aligning the .bss section makes it trivially faster, and makes using
>> larger transfers for the clear slightly easier.
>>
>> [ Impact: trivial performance enhancement, future patch prep ]
>>
>> Signed-off-by: H. Peter Anvin <hpa at zytor.com>
>> ---
>>  arch/x86/boot/compressed/vmlinux.lds.S |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
>> index 0d26c92..27c168d 100644
>> --- a/arch/x86/boot/compressed/vmlinux.lds.S
>> +++ b/arch/x86/boot/compressed/vmlinux.lds.S
>> @@ -42,6 +42,7 @@ SECTIONS
>>  		*(.data.*)
>>  		_edata = . ;
>>  	}
>> +	. = ALIGN(32);
>
> Where does this magic 32 come from?
> I would assume the better choice would be:
> . = ALIGN(L1_CACHE_BYTES);
>
> So we match the relevant CPU.
>
> In general for alignment of output sections I see the need for:
> 1) Function call
> 2) L1_CACHE_BYTES
> 3) PAGE_SIZE
> 4) 2*PAGE_SIZE
>
> But I see magic constant used here and there that does not match
> the above (when looking at all archs).
> So I act when I see a new 'magic' number..
>

I totally agree.

gcc itself has a strange 32-byte alignment rule (unless using -Os) for
objects of size >= 32 bytes. Did you know that?

$ cat try.c
char foo[32] = {1};

$ gcc -O -S try.c
        .file   "try.c"
.globl foo
        .data
        .align 32    <<< HERE , what a mess >>
        .type   foo, @object
        .size   foo, 32
foo:
        .byte   1
        .zero   31
        .ident  "GCC: (GNU) 4.4.0"
        .section        .note.GNU-stack,"",@progbits

This marks many kernel .o files with a 2**5 alignment of .data or percpudata.
At link time, it creates many holes.

In my opinion, gcc should have a separate option for this, distinct from
-Os, as -Os has too expensive side effects on code speed.

I can save a lot of data space if I patch gcc-4.4.0/config/i386/i386.c to:

 /* Compute the alignment for a static variable.
    TYPE is the data type, and ALIGN is the alignment that
    the object would ordinarily have.  The value of this function is used
    instead of that alignment to align the object.  */

 int
 ix86_data_alignment (tree type, int align)
 {
-  int max_align = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
+  int max_align = BITS_PER_WORD;

   if (AGGREGATE_TYPE_P (type)
       && TYPE_SIZE (type)
       && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
       && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
           || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
       && align < max_align)
     align = max_align;

   /* x86-64 ABI requires arrays greater than 16 bytes to be aligned
      to 16byte boundary.  */
   if (TARGET_64BIT)
     {
       if (AGGREGATE_TYPE_P (type)
           && TYPE_SIZE (type)
           && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
           && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
               || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
           && align < 128)
         return 128;