aligned attribute

Maurizio Vitale <maurizio.vitale@xxxxxxxxxxxxxxxxxxxxxx> · Wed, 20 Dec 2006 08:27:27 -0500

I have a problem understanding the behaviour of the aligned
attribute. Mike Stump already gave an answer on the developer list,
where I mistakenly posted my question yesterday, but I'm posting
here with additional details, in the hope of getting a pointer to
where the semantics of the aligned attribute are documented.

I'm looking at C++ using gcc 4.1.1. Is there any difference between C  
and C++ in the handling of the aligned attribute?
The OS is Linux Fedora Core 6 on an Intel Core 2 Duo (X86_64  
architecture).

What I'm trying to achieve is 16-byte alignment for all kinds of
allocation. Mike suggested using a special-purpose allocator, but I
need to cover more than dynamic allocation: some of the objects that
need to be aligned will be allocated by the user, so global and
automatic allocation need to be covered as well.
The reason I need the alignment is to carve out space for 4 tag bits.
The only architecture I'm interested in is Linux on X86_64 (and later
Mac OS X on 64-bit Intel).
The compilers of interest are GCC 4.1.1 and, unfortunately for me,
gcc 3.4.5 and 3.2.3.
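
To make the tag-bit motivation concrete, here is a minimal sketch of how 4 tag bits could be packed into the low bits of a pointer once 16-byte alignment is guaranteed. The names kTagMask, tag_pointer, get_tag and strip_tag are hypothetical helpers for illustration only, not part of GCC or any library.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical helpers (not from any library): a 16-byte aligned address
// always has its low 4 bits clear, so they can carry a small tag.
const uintptr_t kTagMask = 0xF;

// Store a tag (0..15) in the low bits of an aligned pointer.
inline void* tag_pointer(void* p, unsigned tag) {
    uintptr_t bits = reinterpret_cast<uintptr_t>(p);
    assert((bits & kTagMask) == 0);  // fails if p is not 16-byte aligned
    assert(tag <= kTagMask);
    return reinterpret_cast<void*>(bits | tag);
}

// Read the tag back out of a tagged pointer.
inline unsigned get_tag(void* tagged) {
    return static_cast<unsigned>(
        reinterpret_cast<uintptr_t>(tagged) & kTagMask);
}

// Recover the original, dereferenceable pointer.
inline void* strip_tag(void* tagged) {
    return reinterpret_cast<void*>(
        reinterpret_cast<uintptr_t>(tagged) & ~kTagMask);
}
```

The first assertion is exactly where an object placed on an 8-byte boundary would be caught, which is why the alignment has to hold for every kind of allocation.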

A small example shows that gcc tries to obey requests for 16-byte
alignment. With the following compiler:

[mav@thor aloha]$ g++ -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.1 20061011 (Red Hat 4.1.1-30)

And the following short program:
#include <iostream>

class C {
        char c;
}; // aligned variant: } __attribute__ ((aligned (16)));

C z;
char k;
C w;

int
main(int argc, char** argv) {
        C c;
        C d;
        // Printing a pointer already emits a "0x" prefix, hence the
        // doubled "0x0x" in the output.
        ::std::cout << ::std::hex << "&c = 0x" << &c << "\n&d = 0x" << &d << "\n";
        ::std::cout << ::std::hex << "&z = 0x" << &z << "\n&w = 0x" << &w << "\n";
}

I get (compiled with g++ -o align align.cpp):
	[mav@thor aloha]$ ./align
	&c = 0x0x7fff59d4149f
	&d = 0x0x7fff59d4149e
	&z = 0x0x601374
	&w = 0x0x601376

If the aligned attribute is used, I get:
	[mav@thor aloha]$ ./align
	&c = 0x0x7fff92b26280
	&d = 0x0x7fff92b26270
	&z = 0x0x601380
	&w = 0x0x6013a0

But when I move to my real application (an event-driven simulator) I
get stack-allocated objects on 8-byte boundaries.

Now, my reading of the documentation is that the aligned attribute  
(when used for types, as in the example I gave) applies to all  
allocations (global, auto and heap). Is there any place where it is  
said otherwise? Regardless of the documentation, what does the  
implementation do?
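
One way to probe what the implementation actually does for each storage class is a small runtime check. The following is only a sketch: is_aligned, Tagged, global_obj and check_alignment are hypothetical names, and the heap case relies on plain new, which before C++17 takes no alignment argument, so its result reflects whatever the underlying allocator happens to return rather than the attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Hypothetical helper: true if p lies on a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<uintptr_t>(p) % alignment == 0;
}

// Same shape as class C in the example program, with the attribute enabled.
struct Tagged {
    char c;
} __attribute__ ((aligned (16)));

Tagged global_obj;  // static storage duration

// Returns a bit mask of the storage classes that ended up 16-byte aligned:
// bit 0 = global, bit 1 = automatic, bit 2 = heap (plain new).
inline unsigned check_alignment() {
    Tagged local_obj;
    Tagged* heap_obj = new Tagged;
    unsigned result = 0;
    if (is_aligned(&global_obj, 16)) result |= 1;
    if (is_aligned(&local_obj, 16))  result |= 2;
    if (is_aligned(heap_obj, 16))    result |= 4;
    delete heap_obj;
    return result;
}
```

Calling check_alignment() from several call depths in the real application, rather than only from main, would show whether the 8-byte stack addresses depend on how the enclosing frame was set up.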

The only caveat I've found is that the linker might be unable to obey  
the alignment request. Here's the text from the doc:
	Note that the effectiveness of aligned attributes may be limited by
	inherent limitations in your linker. On many systems, the linker is
	only able to arrange for variables to be aligned up to a certain
	maximum alignment. (For some linkers, the maximum supported
	alignment may be very very small.) If your linker is only able to
	align variables up to a maximum of 8 byte alignment, then specifying
	aligned(16) in an __attribute__ will still only provide you with 8
	byte alignment. See your linker documentation for further
	information.

The linker I'm using is GNU ld 2.17. The documentation contains no  
machine-dependent section for X86_64 and I find no indication that a  
16 byte alignment would not be allowed. That said, I have problems  
understanding how the linker could even be in the picture for dynamic  
and auto allocations.

In the first paragraph of the documentation for the aligned  
attribute, it is said:

	aligned (alignment)
	    This attribute specifies a minimum alignment (in bytes) for
	    variables of the specified type. For example, the declarations:

	              struct S { short f[3]; } __attribute__ ((aligned (8)));
	              typedef int more_aligned_int __attribute__ ((aligned (8)));

	    force the compiler to insure (as far as it can) that each
	    variable whose type is struct S or more_aligned_int will be
	    allocated and aligned at least on a 8-byte boundary. On a SPARC,
	    having all variables of type struct S aligned to 8-byte
	    boundaries allows the compiler to use the ldd and std
	    (doubleword load and store) instructions when copying one
	    variable of type struct S to another, thus improving run-time
	    efficiency.

	    Note that the alignment of any given struct or union type is
	    required by the ISO C standard to be at least a perfect multiple
	    of the lowest common multiple of the alignments of all of the
	    members of the struct or union in question. This means that you
	    can effectively adjust the alignment of a struct or union type
	    by attaching an aligned attribute to any one of the members of
	    such a type, but the notation illustrated in the example above
	    is a more obvious, intuitive, and readable way to request the
	    compiler to adjust the alignment of an entire struct or union
	    type.
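
The two notations described in the quoted text can be compared directly with GCC's __alignof__ operator. This is a small sketch: struct S is taken from the documentation's example, while S_member and same_layout are hypothetical names introduced here for illustration.

```cpp
#include <cassert>

// Attribute on the whole type, as in the documentation's example.
struct S { short f[3]; } __attribute__ ((aligned (8)));

// Attribute attached to one member instead (hypothetical type name).
struct S_member {
    short f[3] __attribute__ ((aligned (8)));
};

// True when both notations give the same alignment and size.
inline bool same_layout() {
    return __alignof__(S) == __alignof__(S_member)
        && sizeof(S) == sizeof(S_member);
}
```

Note that this only inspects the alignment the compiler records for the type; it says nothing about whether a particular variable was actually placed on such a boundary at run time.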

This led me to believe that every time a variable of type S is
allocated, the requested alignment is used (modulo the linker
limitations above). This paragraph is in the C section, so it doesn't,
strictly speaking, talk about C++ new-style allocation, but (at least
in my mind) it certainly covers automatic allocation on the stack.
And since there's no corresponding section in the C++ portion, I
assumed it was also the case for C++-style dynamic allocation.
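
For the dynamic case alone, the special-purpose allocator Mike suggested could be approximated with class-specific operator new and operator delete built on POSIX posix_memalign. This is a sketch under assumptions, not a statement of what GCC does by default: Node is a hypothetical class, and the approach covers only single-object new expressions, not arrays, globals, or automatic variables.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Hypothetical class: routes its own heap allocations through
// posix_memalign so that `new Node` yields 16-byte aligned storage
// regardless of what the default operator new provides.
struct Node {
    char c;

    static void* operator new(std::size_t size) {
        void* p = 0;
        // 16 is a power of two and a multiple of sizeof(void*),
        // as posix_memalign requires.
        if (posix_memalign(&p, 16, size) != 0)
            throw std::bad_alloc();
        return p;
    }

    static void operator delete(void* p) {
        free(p);  // memory from posix_memalign is released with free
    }
} __attribute__ ((aligned (16)));
```

Because the operators are members of the class, user code that writes `new Node` picks them up without any change at the call site, which matters when some allocations are made by the user.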

Any input is greatly appreciated,
Thanks,

		Maurizio