On 23/03/2021 19.59, Oliver Hartkopp wrote: > > > On 23.03.21 15:00, Rasmus Villemoes wrote: >> Now what CONFIG_* knobs are responsible for putting -mabi=apcs-gnu in >> CFLAGS is left as an exercise for the reader. Regardless, it is not a >> bug in the compiler. The error is the assumption that this language >> >> "Aggregates and Unions >> >> Structures and unions assume the alignment of their most strictly >> aligned component. > > (parse error in sentence) It was a direct quote, but I can try to paraphrase with an example. If you have a struct foo { T1 m1; T2 m2; T3 m3; }, then alignof(struct foo) = max(alignof(T1), alignof(T2), alignof(T3)). Same for a "union foo". But this is specifically for x86-64; for (some flavors of) ARM, other rules apply - namely, alignof(T) is 4 unless T is char or short (or (un)signed variants), ignoring bitfields which have their own rules. Note that while union u {char a; char b;} has alignment 4 on ARM and 1 on x86-64, other types are less strictly aligned on ARM; e.g. s64 aka long long is 8-byte aligned on x86-64 but (still) just 4-byte aligned on ARM. And again, this is just for specific -mabi= options. >> Each member is assigned to the lowest available offset with the >> appropriate >> alignment. The size of any object is always a multiple of the object‘s >> alignment." >> >> from the x86-64 ABI applies on all other architectures/ABIs. >> >>> I'm not a compiler expert but this does not seem to be consistent. >>> >>> Especially as we only have byte sizes (inside and outside of the union) >>> and "A field with a char type is aligned to the next available byte." >> >> Yes, and that's exactly what you got before the anon union was >> introduced. > > Before(!) the union there is nothing to pad. Just to be clear, my "before" was in the temporal sense, i.e. "prior to commit ea7800565a128", all the u8s in struct can_frame were placed one after the other. But after that commit, struct can_frame has a new member replacing can_dlc which happens to occupy 4 bytes (for some ABIs), pushing the subsequent members __pad, __res0 and len8_dlc (formerly known as __res1) ahead. >>> The union is indeed aligned to the word boundary - but the following >>> byte is not aligned to the next available byte. >> >> Yes it is, because the union occupies 4 bytes. The first byte is shared >> by the two char members, the remaining three bytes are padding. > > But why is the union 4 bytes long here and adds a padding of three bytes > at the end? Essentially, because arrays. It's true for _any_ type T that sizeof(T) must be a multiple of alignof(T). Take an array "T x[9]". If x[0] is 4-byte aligned, then in order for x[1] to be 4-byte aligned as well, x[0] must occupy a multiple of 4 bytes. It doesn't matter at all that this happens to be an anonymous union. Layout-wise, you could as well have a definition union uuu { __u8 len; __u8 can_dlc; } and made struct can_frame struct can_frame { canid_t can_id; union uuu u; __u8 __pad; ... }; (you lose the anonymity trick so you'd have to do frame->u.can_dlc instead of just frame->can_dlc). You have a member with alignof()==4 and sizeof()==4; that sizeof() cannot magically become 1 just because that particular instance of the type is not part of an array. Imagine what would happen if the compiler pulled subsequent char members into trailing padding of a previous compound member. E.g. consider struct a { int x; char y; } // alignof==4, sizeof==8, offsetof(y)==4 struct b { struct a a; char z; } If I have a "struct b *b", I'm allowed to do "&b->a" and get a "pointer to struct a". Then I can do memset(&b->a, 0, sizeof(struct a)). Clearly, z must not have been placed inside the trailing padding of struct a. Rasmus