Re: Unaligned access to packed structs on ppc405

What seems odd to me is that accesses to packed structures are inherently less efficient than accesses to non-packed ones. In my example, three lbz instructions instead of one lwz mean three memory accesses instead of one, which is a penalty of two extra accesses over the slow bus. On top of that, there is a further penalty when the bit field crosses a byte boundary (as in my example), because GCC must generate extra code to "or" those bytes together, which, BTW, in my opinion contradicts what you wrote earlier:

David Edelsohn wrote:
"The lbz has to do with the size and the packed alignment. With the packed structure, GCC chooses the smallest memory access that covers the bitfield. Once GCC has chosen bytes, it cannot merge the loads together. If the structure were not declared packed, GCC would use wider loads with masking, and then determine that the loads refer to the same object."

In my case it shouldn't have chosen bytes, because a single byte doesn't cover a bit field that spans a byte boundary. I don't know whether what GCC does is "Right", and I guess that if it was implemented in 4.1, somebody decided it was "Right". But if the generated code has three times the instruction count and three times the memory accesses for no apparent reason, then I can't see why anyone would want this behavior. The code produced by 4.0.1 for the same structure, when it is not accessed through a pointer, is just fine, so why break it like that? Something just doesn't seem right, I'm sorry.

I think I can summarize it by saying that if it's less efficient then there is no justification for it.
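
To make the comparison concrete, here is a rough C sketch of the two access patterns being argued about. The field layout (12 bits starting 4 bits into a big-endian 32-bit word) and all function names are assumptions made up for illustration; this is not the structure from my example, and the code only models what the instructions do rather than reproducing GCC's output.

```c
#include <stdint.h>

/* Hypothetical field: 12 bits, starting 4 bits into a big-endian word,
 * so it crosses the boundary between byte 0 and byte 1. */

/* Byte-wise pattern: one load per byte touched (think lbz),
 * then shift and "or" the pieces together. */
uint32_t field_bytewise(const uint8_t *p)
{
    uint32_t hi = p[0];            /* low nibble = field bits 11..8 */
    uint32_t lo = p[1];            /* whole byte = field bits 7..0  */
    return ((hi & 0x0Fu) << 8) | lo;
}

/* Models a single aligned 32-bit big-endian load (think lwz),
 * written with byte reads only so the sketch is portable. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Word-wise pattern: one wide load, then shift and mask. */
uint32_t field_wordwise(const uint8_t *p)
{
    return (load_be32(p) >> 16) & 0x0FFFu;
}
```

Both functions return the same value for the same four bytes; the difference is only in how many separate loads the byte-wise version implies.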

Yaro



David Edelsohn wrote:
John Yates writes:

John> Do I read this correctly?  Are you truly saying that two structs
John> with identical layout will trigger different code sequences just
John> because one was declared packed?

	Yes.  Why is that strange?  attribute packed assigns the smallest
possible alignment so that the compiler composes the layout of the
structure or bitfield in the most compact form possible.  Even if the
layout produced is the same, the smaller alignment is carried around with
the fields and causes the compiler to use more conservative access
operations.
David
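
A minimal sketch of the point above, assuming GCC (or a compatible compiler) and a common ABI where uint32_t is 4-byte aligned. The two struct definitions and the accessor names are hypothetical and chosen only so that both structs end up with the same layout; they are not taken from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fields; together they fill exactly 32 bits. */
struct plain_fields {
    uint32_t a : 4;
    uint32_t b : 12;   /* crosses a byte boundary */
    uint32_t c : 16;
};

/* Same fields, same size, but declared packed. */
struct packed_fields {
    uint32_t a : 4;
    uint32_t b : 12;
    uint32_t c : 16;
} __attribute__((packed));

/* Accessors through a pointer: the compiler may assume natural
 * alignment for the plain struct, but only byte alignment for the
 * packed one, even though the bits sit in the same places. */
unsigned read_plain_b(const struct plain_fields *p)   { return p->b; }
unsigned read_packed_b(const struct packed_fields *p) { return p->b; }

/* The alignment each type carries around (C11 _Alignof). */
size_t plain_alignment(void)  { return _Alignof(struct plain_fields); }
size_t packed_alignment(void) { return _Alignof(struct packed_fields); }
```

Compiling the two accessors with `-O2 -S` and comparing the output is one way to see whether a particular compiler version picks different load widths for them; the result will depend on the target and the GCC release.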


