What seems odd to me is that accesses to packed structures are inherently
less efficient than accesses to non-packed structures.
In my example, the 3 lbz instructions instead of one lwz require 3
memory accesses instead of 1, a penalty of 2 extra memory accesses
over the slow bus. On top of that there is an extra penalty when
the bit field crosses a byte boundary (as in my example), where GCC must
generate extra code to "or" those bytes together, which, BTW, in my opinion
contradicts what you wrote earlier:
David Edelsohn wrote:
" The lbz has to do with the size and the packed alignment. With the
packed structure, GCC chooses the smallest memory access that covers the
bitfield. Once GCC has chosen bytes, it cannot merge the loads together.
If the structure were not declared packed, GCC would use wider loads
with masking, and then determine that the loads refer to the same object."
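To make the two access strategies in the quote concrete, here is a minimal hand-written sketch of each. It is not GCC output; the field position (12 bits, starting 4 bits into a big-endian 32-bit word) and the function names are assumptions chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical field: 12 bits wide, starting 4 bits into a big-endian
   32-bit word, so it straddles the boundary between bytes 0 and 1. */

/* Byte-wise strategy: one load per byte touched, then shift and OR. */
uint32_t field_bytewise(const uint8_t *p)
{
    uint32_t hi = p[0] & 0x0Fu;   /* low 4 bits of byte 0 */
    uint32_t lo = p[1];           /* all 8 bits of byte 1 */
    return (hi << 8) | lo;
}

/* Word-wise strategy: form the whole 32-bit word, then shift and mask.
   The word is assembled from bytes here so the result does not depend on
   host endianness or alignment; on an aligned big-endian target the same
   value could come from a single word load. */
uint32_t field_wordwise(const uint8_t *p)
{
    uint32_t w = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
               | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (w >> 16) & 0x0FFFu;
}
```

Both functions return the same value for the same buffer; the difference is only in how many loads and how much combining code each one needs.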
In my case it shouldn't have chosen bytes, because a single byte doesn't
cover a bit field that spans a byte boundary. I don't know whether what
GCC does is "Right", and I guess that if it was implemented in 4.1,
somebody decided that it was "Right". But if the generated code has 3
times the instruction count and 3 times the memory accesses, for no
apparent reason, then I can't see why anyone would want this behavior.
I mean, the code produced by 4.0.1 for the same structure accessed not
through a pointer is just fine, so why break it like that? Something
just doesn't seem right, I'm sorry.
I think I can summarize it by saying that if it's less efficient then
there is no justification for it.
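For readers who want to reproduce the comparison, the following is a minimal sketch of the kind of test case being discussed. The struct, its field widths, and the function names are hypothetical and are not the structure from the original report. It assumes GCC (or a compiler that accepts the packed attribute).

```c
/* Hypothetical packed struct; field b crosses the boundary between
   bytes 0 and 1 on both big-endian and little-endian GCC targets. */
struct __attribute__((packed)) hdr {
    unsigned a : 4;
    unsigned b : 12;
    unsigned c : 16;
};

struct hdr g_hdr;

/* Access to a named object. */
unsigned read_b_direct(void)
{
    return g_hdr.b;
}

/* The same field read through a pointer. */
unsigned read_b_via_ptr(const struct hdr *h)
{
    return h->b;
}
```

Compiling this with something like `gcc -O2 -S` and comparing the two functions shows which loads a given GCC version and target choose for each case; the result is expected to vary between versions and targets.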
Yaro
David Edelsohn wrote:
John Yates writes:
John> Do I read this correctly? Are you truly saying that two structs
John> with identical layout will trigger different code sequences just
John> because one was declared packed?
Yes. Why is that strange? attribute packed assigns the smallest
possible alignment so that the compiler composes the layout of the
structure or bitfield in the most compact form possible.  Even if the
layout produced is the same, the smaller alignment is carried around with
the fields and causes the compiler to use more conservative access
operations.
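The point about alignment being carried with the type can be checked directly. The sketch below assumes GCC-style attributes and a C11 compiler; the two structs and the helper names are hypothetical. The members and size are the same, but the alignment the compiler may assume differs.

```c
#include <stddef.h>

/* Hypothetical structs: same members, same size; only the attribute differs. */
struct plain_s {
    unsigned a : 4;
    unsigned b : 12;
    unsigned c : 16;
};

struct __attribute__((packed)) packed_s {
    unsigned a : 4;
    unsigned b : 12;
    unsigned c : 16;
};

/* The fields add up to 32 bits, so both structs are expected to have
   the same size under GCC. */
_Static_assert(sizeof(struct plain_s) == sizeof(struct packed_s),
               "packed and plain variants should have the same size");

/* Alignment the compiler may assume for an object of each type. */
size_t plain_align(void)  { return _Alignof(struct plain_s); }
size_t packed_align(void) { return _Alignof(struct packed_s); }
```

With the packed attribute the reported alignment drops to 1, so through a pointer the compiler can no longer assume that a word-sized load of the object is aligned.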
David