I'm having trouble understanding the behaviour of the aligned
attribute. Mike Stump already gave an answer on the developer list,
where I mistakenly posted my question yesterday, but I'm posting
here with additional details, in the hope of getting a pointer to
where the semantics of the aligned attribute are documented.
I'm looking at C++ using gcc 4.1.1. Is there any difference between C
and C++ in the handling of the aligned attribute?
The OS is Linux Fedora Core 6 on an Intel Core 2 Duo (X86_64
architecture).
What I'm trying to achieve is 16-byte alignment for all kinds of
allocation. Mike suggested using a special-purpose allocator, but I
need to cover more than dynamic allocation: some of the objects that
must be aligned will be allocated by the user, and global and
automatic allocation need to be covered as well.
The reason I need the alignment is to carve out space for 4 tag bits
in the low bits of each address.
The only architecture I'm interested in is Linux on X86_64 (and later
Mac OS X on 64-bit Intel).
The compilers of interest are GCC 4.1.1 and, unfortunately for me,
GCC 3.4.5 and 3.2.3.
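To make the tag-bit requirement concrete, here is a minimal sketch of how 4 tag bits could be packed into a 16-byte-aligned address. The names Node, tag_pointer, untag_pointer and tag_of are hypothetical and not taken from the real application; the sketch only assumes that every tagged object really does sit on a 16-byte boundary.

```cpp
#include <cassert>
#include <stdint.h>   // uintptr_t

// Hypothetical example type; any 16-byte-aligned type would do.
struct Node { char c; } __attribute__ ((aligned (16)));

// With 16-byte alignment the low 4 address bits are always zero.
static const uintptr_t kTagMask = 0xF;

// Pack a 4-bit tag into the low bits of a 16-byte-aligned pointer.
inline uintptr_t tag_pointer(const void* p, unsigned tag) {
    uintptr_t bits = reinterpret_cast<uintptr_t>(p);
    assert((bits & kTagMask) == 0 && "pointer is not 16-byte aligned");
    assert(tag <= kTagMask && "tag does not fit in 4 bits");
    return bits | tag;
}

// Recover the original pointer by clearing the tag bits.
inline void* untag_pointer(uintptr_t tagged) {
    return reinterpret_cast<void*>(tagged & ~kTagMask);
}

// Extract the 4-bit tag.
inline unsigned tag_of(uintptr_t tagged) {
    return static_cast<unsigned>(tagged & kTagMask);
}
```

An object that lands on an 8-byte boundary would lose its top tag bit under this scheme, which is why the alignment has to hold for every kind of allocation.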
A small example shows that gcc tries to honour requests for 16-byte
alignment. With the following compiler:
[mav@thor aloha]$ g++ -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.1 20061011 (Red Hat 4.1.1-30)
And the following short program:
#include <iostream>
class C {
    char c;
}; // __attribute__ ((aligned (16)));
C z;
char k;
C w;
int
main(int argc, char** argv) {
    C c;
    C d;
    ::std::cout << ::std::hex << "&c = 0x" << &c << "\n&d = 0x" << &d << "\n";
    ::std::cout << ::std::hex << "&z = 0x" << &z << "\n&w = 0x" << &w << "\n";
}
I get (compiled with g++ -o align align.cpp):
[mav@thor aloha]$ ./align
&c = 0x0x7fff59d4149f
&d = 0x0x7fff59d4149e
&z = 0x0x601374
&w = 0x0x601376
If the aligned attribute is used, I get:
[mav@thor aloha]$ ./align
&c = 0x0x7fff92b26280
&d = 0x0x7fff92b26270
&z = 0x0x601380
&w = 0x0x6013a0
But when I move to my real application (an event-driven simulator) I
get stack allocated objects on 8-byte boundaries.
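One way to find out where the 8-byte-aligned objects are created is to check the address in the constructor. The following is only a sketch: Event is a hypothetical stand-in for a simulator class, and is_aligned is an illustrative helper, not an existing API.

```cpp
#include <cassert>
#include <stdint.h>   // uintptr_t

// True if p lies on an `alignment`-byte boundary.
// `alignment` must be a power of two.
inline bool is_aligned(const void* p, uintptr_t alignment) {
    return (reinterpret_cast<uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical stand-in for a simulator type that needs 4 free tag bits.
class Event {
public:
    Event() : payload(0) {
        // Aborts at the construction site of the first misaligned object,
        // so a debugger backtrace shows which allocation is at fault.
        assert(is_aligned(this, 16));
    }
private:
    char payload;
} __attribute__ ((aligned (16)));
```

Because the type is declared as 16-byte aligned, an optimizing compiler is allowed to assume the check always passes and remove it, so this is most trustworthy in an unoptimized build.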
Now, my reading of the documentation is that the aligned attribute
(when used for types, as in the example I gave) applies to all
allocations (global, auto and heap). Is there any place where it is
said otherwise? Regardless of the documentation, what does the
implementation do?
The only caveat I've found is that the linker might be unable to obey
the alignment request. Here's the text from the doc:
Note that the effectiveness of aligned attributes may be limited by
inherent limitations in your linker. On many systems, the linker is
only able to arrange for variables to be aligned up to a certain
maximum alignment. (For some linkers, the maximum supported alignment
may be very very small.) If your linker is only able to align
variables up to a maximum of 8 byte alignment, then specifying
aligned(16) in an __attribute__ will still only provide you with
8 byte alignment. See your linker documentation for further
information.
The linker I'm using is GNU ld 2.17. The documentation contains no
machine-dependent section for X86_64 and I find no indication that a
16 byte alignment would not be allowed. That said, I have problems
understanding how the linker could even be in the picture for dynamic
and auto allocations.
In the first paragraph of the documentation for the aligned
attribute, it is said:
aligned (alignment)
This attribute specifies a minimum alignment (in bytes) for
variables of the specified type. For example, the declarations:
    struct S { short f[3]; } __attribute__ ((aligned (8)));
    typedef int more_aligned_int __attribute__ ((aligned (8)));
force the compiler to insure (as far as it can) that each variable
whose type is struct S or more_aligned_int will be allocated and
aligned at least on a 8-byte boundary. On a SPARC, having all
variables of type struct S aligned to 8-byte boundaries allows the
compiler to use the ldd and std (doubleword load and store)
instructions when copying one variable of type struct S to another,
thus improving run-time efficiency.
Note that the alignment of any given struct or union type is
required by the ISO C standard to be at least a perfect multiple of
the lowest common multiple of the alignments of all of the members
of the struct or union in question. This means that you can
effectively adjust the alignment of a struct or union type by
attaching an aligned attribute to any one of the members of such a
type, but the notation illustrated in the example above is a more
obvious, intuitive, and readable way to request the compiler to
adjust the alignment of an entire struct or union type.
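The two notations described in the quoted text can be compared directly. This is a small sketch: S is the struct from the documentation, while M and check_alignments are hypothetical names added for illustration. It relies on __alignof__, a GCC extension that reports the alignment of a type.

```cpp
#include <cassert>
#include <cstddef>

// Alignment requested on the whole type, as in the quoted documentation.
struct S { short f[3]; } __attribute__ ((aligned (8)));

// Alignment requested on one member; the enclosing struct inherits it.
struct M { short f[3] __attribute__ ((aligned (8))); };

// Both notations should yield a type with 8-byte alignment.
inline void check_alignments() {
    assert(__alignof__(S) == 8);
    assert(__alignof__(M) == 8);
    // The size is padded to a multiple of the alignment, so arrays of
    // these types keep every element aligned.
    assert(sizeof(S) % 8 == 0);
    assert(sizeof(M) % 8 == 0);
}
```

Note that this only shows what alignment the compiler records for the type; it says nothing about whether a particular stack or heap allocation actually honours it.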
That paragraph led me to believe that every time a variable of type
S is allocated, the requested alignment is used (modulo the linker
limitations noted earlier). It is in the C section, so strictly
speaking it doesn't talk about C++ new-style allocation, but (at
least in my mind) it certainly covers automatic allocation on the
stack.
And since there's no corresponding section in the C++ portion, I
assumed the same held for C++-style dynamic allocation.
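For the dynamic case, a plain new expression in compilers of this vintage calls operator new with only a size, so the allocator is never told about the attribute. One possible workaround, sketched here under the assumption that a POSIX system is available, is a class-specific operator new built on posix_memalign. Tagged and demo_heap_alignment are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>        // std::bad_alloc
#include <stdlib.h>   // posix_memalign, free (POSIX)

// Hypothetical type; only the operator new/delete pair matters here.
class Tagged {
public:
    static void* operator new(std::size_t size) {
        void* p = 0;
        // posix_memalign returns 0 on success and stores the block in p.
        if (posix_memalign(&p, 16, size) != 0)
            throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) {
        free(p);  // memory from posix_memalign is released with free()
    }
private:
    char c;
} __attribute__ ((aligned (16)));

// Usage: a heap-allocated Tagged starts on a 16-byte boundary.
inline void demo_heap_alignment() {
    Tagged* t = new Tagged;
    assert((reinterpret_cast<std::size_t>(t) & 15) == 0);
    delete t;
}
```

This covers only single-object new; operator new[] and operator delete[] would need the same treatment, and it does nothing for global or automatic objects.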
Any input is greatly appreciated,
Thanks,
Maurizio