On 02/05/2015 05:46 AM, Maciej W. Rozycki wrote:
On Tue, 27 Jan 2015, David Daney wrote:
It is bizarre, and perhaps almost mind bending, but that seems to be how it is
specified. Certainly the OCTEON implementation works this way.
Well, I think this observation:
"2.2.2.2 Memory Operation Functions
"Regardless of byte ordering (big- or little-endian), the address of a
halfword, word, or doubleword is the smallest byte address of the bytes
that form the object. For big-endian ordering this is the
most-significant byte; for a little-endian ordering this is the
least-significant byte."
contradicts your claim [...]
One can argue about the meaning of the text in the reference manual. But in
the end, the behavior of real processors is what we are forced to deal with.
In the case of all existing OCTEON processors, there is no Status[RE] bit, but
you can switch the endianness of the entire CPU under software control. I am
really making statements based on how they actually work, not assertions about
the meaning of the specification. However, I do believe that this is what is
specified.
If you have access to processors with a working Status[RE] bit, you could
empirically determine how they work.
Well, I do actually, I have a working machine driven by an R4000
processor. It was the original implementation of the Status.RE feature
and therefore it can be used as the reference. I don't feel tempted to
use my time to actually make any checks though.
What I did instead, I checked the R4000 manual ...
You are still relying on your interpretation of the text, rather than
the actual behavior of the device. It is not at all surprising that your
interpretation of the manual hasn't changed, but it doesn't persuade me.
I am sticking to my belief that OCTEON faithfully implements the
specification with respect to the in-memory byte ordering of the various
load and store instructions. Switching the endianness of the processor
results in byte arrays being scrambled such that the low-order 3 bits
of each byte address are XORed with 7. This implies that aligned 64-bit
loads and stores (LD, SD, LLD, SCD) result in identical in-memory and
in-register layout for either endianness. This is quite handy when
writing driver code for devices that have 64-bit registers.
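
As a rough illustration of the XOR 7 scrambling described above, here is a
toy model in Python. It is only a sketch of the address arithmetic, not
OCTEON code: the function names are made up for this example, and it simply
assumes that little-endian mode XORs the low-order 3 bits of every byte
address with 7.

```python
def phys_addr(addr, little_endian):
    """Map a CPU byte address to a physical byte location (toy model)."""
    # Assumption: little-endian mode XORs the low 3 address bits with 7.
    return addr ^ 7 if little_endian else addr


def store_bytes(mem, addr, data, little_endian):
    """Store a byte array one byte at a time, as a run of SB would."""
    for i, value in enumerate(data):
        mem[phys_addr(addr + i, little_endian)] = value


def load_bytes(mem, addr, length, little_endian):
    """Load a byte array one byte at a time, as a run of LB would."""
    return bytes(mem[phys_addr(addr + i, little_endian)] for i in range(length))


def store_dword(mem, addr, value, little_endian):
    """Model an aligned 64-bit store (SD) in the given endianness."""
    assert addr % 8 == 0, "only aligned accesses are modelled"
    order = "little" if little_endian else "big"
    store_bytes(mem, addr, value.to_bytes(8, order), little_endian)


def load_dword(mem, addr, little_endian):
    """Model an aligned 64-bit load (LD) in the given endianness."""
    assert addr % 8 == 0, "only aligned accesses are modelled"
    order = "little" if little_endian else "big"
    return int.from_bytes(load_bytes(mem, addr, 8, little_endian), order)


# The same 64-bit value stored in each mode lands in memory identically.
mem_be = bytearray(8)
mem_le = bytearray(8)
store_dword(mem_be, 0, 0x0102030405060708, little_endian=False)
store_dword(mem_le, 0, 0x0102030405060708, little_endian=True)
same_layout = mem_be == mem_le

# A byte array written in one mode reads back reversed within each
# 8-byte group when the mode is switched.
mem = bytearray(8)
store_bytes(mem, 0, b"ABCDEFGH", little_endian=False)
scrambled = load_bytes(mem, 0, 8, little_endian=True)
```

In this model same_layout is True and scrambled is b"HGFEDCBA", which is
the behavior claimed above: 64-bit accesses are unaffected by the switch,
while byte-wise views of the same memory are not.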
[...]
David Daney