On Tue, 27 Jan 2015, David Daney wrote:

> > > It is bizarre, and perhaps almost mind bending, but that seems to be
> > > how it is specified.  Certainly the OCTEON implementation works this
> > > way.
> >
> >  Well, I think this observation:
> >
> > "2.2.2.2 Memory Operation Functions
> >
> > "Regardless of byte ordering (big- or little-endian), the address of a
> > halfword, word, or doubleword is the smallest byte address of the bytes
> > that form the object.  For big-endian ordering this is the
> > most-significant byte; for a little-endian ordering this is the
> > least-significant byte."
> >
> > contradicts your claim [...]
>
> One can argue about the meaning of the text in the reference manual.  But
> in the end, the behavior of real processors is what we are forced to deal
> with.
>
> In the case of all existing OCTEON processors, there is no Status[RE] bit,
> but you can switch the endianess of the entire CPU under software control.
> I am really making statements based on how they actually work, not
> assertions about the meaning of the specification.  However, I do believe
> that this is what is specified.
>
> If you have access to processors with a working Status[RE] bit, you could
> empirically determine how they work.

 Well, I do actually, I have a working machine driven by an R4000
processor.  It was the original implementation of the Status.RE feature
and therefore it can be used as the reference.  I don't feel tempted to
use my time to actually make any checks though.  What I did instead, I
checked the R4000 manual and the descriptions there are exactly the same
as in the current MIPS architecture manual, down to using the same names
like `BigEndianMem' or `StoreMemory'.  Given that this is documentation
that has been purposely prepared for a specific piece of silicon I have
no reasons to believe it is inaccurate here.

 Furthermore, from your description, assuming that I understand it
correctly, I infer that the reverse-endian mode as implemented by Octeon
processors is completely useless and therefore a waste of silicon.  Given
the circumstances, if I were a processor architecture implementer and were
faced with a useless optional feature, I would have either omitted it
entirely or implemented it in a different, useful manner, as a vendor
extension.  Given that, as you say, you don't wire it to Status.RE anyway,
as the architecture standard mandates, this is already a vendor extension,
so I fail to see a reason to avoid doing it correctly from the usability
point of view, and then reporting observations back to the architecture
maintainers so that they can be taken into account in a future revision of
the architecture standard.

 The conclusion is that if we ever decide to implement it for Linux, then
we'll probably have to include a small run-time check that, during
bootstrap, makes the kernel switch into the reverse-endian user mode
temporarily, executes a small piece of code there that stores some
immediate data to memory, then traps back into the kernel to verify that
the byte order of the stored data is sane and decides whether to make the
feature available to user software, before moving on.

  Maciej
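
 For illustration only, the verification half of the bootstrap check
described above might look roughly like the C sketch below.  It is not
taken from any kernel code.  The names PROBE_VALUE, classify_probe and
re_mode_usable are hypothetical, and so is the choice of a 32-bit probe
value.  The sketch also assumes that "sane" means the stored word shows up
at the same address as the byte-reversed image of what the kernel's native
byte order would have produced.  The MIPS-specific part (entering
reverse-endian user mode, doing the store and trapping back) is left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical immediate the user-mode probe is assumed to have stored. */
#define PROBE_VALUE 0x11223344u

enum probe_result {
	PROBE_NATIVE,		/* bytes are in the kernel's own order */
	PROBE_REVERSED,		/* bytes are in the opposite order */
	PROBE_UNEXPECTED	/* anything else, e.g. stored elsewhere */
};

/*
 * Classify the four bytes found at the probe address after the trap
 * back into the kernel.  buf[0] is the byte at the lowest address.
 */
static enum probe_result classify_probe(const unsigned char *buf,
					bool kernel_big_endian)
{
	/* Memory images of PROBE_VALUE, lowest address first. */
	static const unsigned char be_image[4] = {
		(PROBE_VALUE >> 24) & 0xff, (PROBE_VALUE >> 16) & 0xff,
		(PROBE_VALUE >> 8) & 0xff, PROBE_VALUE & 0xff
	};
	static const unsigned char le_image[4] = {
		PROBE_VALUE & 0xff, (PROBE_VALUE >> 8) & 0xff,
		(PROBE_VALUE >> 16) & 0xff, (PROBE_VALUE >> 24) & 0xff
	};
	const unsigned char *native = kernel_big_endian ? be_image : le_image;
	const unsigned char *reversed = kernel_big_endian ? le_image : be_image;

	assert(buf != NULL);

	if (memcmp(buf, reversed, 4) == 0)
		return PROBE_REVERSED;
	if (memcmp(buf, native, 4) == 0)
		return PROBE_NATIVE;
	return PROBE_UNEXPECTED;
}

/* Offer the feature to user software only if the store was reversed. */
static bool re_mode_usable(const unsigned char *buf, bool kernel_big_endian)
{
	return classify_probe(buf, kernel_big_endian) == PROBE_REVERSED;
}
```

 On a big-endian kernel a probe buffer holding 44 33 22 11 would be
accepted, whereas 11 22 33 44 (nothing reversed) or any other pattern
would leave the feature disabled.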