On Thu, 5 Feb 2015, David Daney wrote:

> > Well, I do actually, I have a working machine driven by an R4000
> > processor.  It was the original implementation of the Status.RE feature
> > and therefore it can be used as the reference.  I don't feel tempted to
> > use my time to actually make any checks though.
> >
> > What I did instead, I checked the R4000 manual ...
>
> You are still relying on your interpretation of the text, rather than actual
> behavior of the device.  It is not at all surprising that your interpretation
> of the manual hasn't changed, but it doesn't persuade me.
>
> I am sticking to my belief that OCTEON faithfully implements the specification
> with respect to the in-memory byte ordering of the various load and store
> instructions.  Switching the endianness of the processor results in byte arrays
> being scrambled such that the low-order 3 bits are XOR 7.  This implies that
> aligned 64-bit loads and stores (LD, SD, LLD, SCD) result in identical
> in-memory and in-register layout for either endianness.  This is quite handy
> when writing driver code for devices that have 64-bit registers.

 Fair enough, this helps with interfacing fixed-endian peripherals such as
a PCI bus.  Some MIPS-based SOCs map PCI/memory twice in the bus address
space for the benefit of big-endian systems, once with a byte lane
matching policy and again with a bit lane matching policy.  This results
in a swapped memory view between the two mapping spaces as seen by PCI
devices doing DMA.

 What you describe refers to the bit lane matching policy, which has
benefits for PIO and MMIO, as values written to peripheral registers do not
change with a host bus endianness change (as long as accesses are, as you
noted, only made using the specific data width intended).  This is in
contrast to DMA, where the byte lane matching policy makes more sense, as
it makes byte streams written to memory the same regardless of the host
bus endianness.

 What does it have to do with the user mode though?
Device drivers do not usually run in the user mode and even if they do
(such as X11 DDX), then what would be the benefit for them from running in
the reverse-endian mode?  They'd have to cope with the rest of the
environment being byte-swapped anyway.  Having, say, an MMIO resource
mapped as a region configured in hardware for swapping with the bit lane
matching policy would make more sense than having the whole user binary
(here the X server) built for and run with the opposite endianness.

 The use of CP0.Status.RE is different and it has to be implemented so as
to fulfil its purpose.  That, for example, may be running little-endian
DEC Ultrix/MIPS user binaries under a foreign personality on a big-endian
MIPS machine running SGI IRIX or Linux.  Of course with the demise of
proprietary *nix systems for the MIPS processor such a feature seems of
little use.

  Maciej
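
 For illustration only, the bit lane matching behaviour discussed above
(byte addresses scrambled by XOR 7, aligned doubleword accesses unchanged)
can be modelled with a minimal C sketch.  Everything here is an
assumption made for the example: the names mem, load_byte, load_dword and
check_model are hypothetical, and the model is a plain software
simulation, not OCTEON or R4000 code and not a statement about what any
particular processor does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: memory is an array of 64-bit words attached to
 * the bus by bit lane, so a doubleword is stored the same way whichever
 * endianness the processor runs with. */
static uint64_t mem[2] = { 0x0011223344556677ULL, 0x8899aabbccddeeffULL };

/* Byte load: the endianness only selects which lane a byte address uses. */
uint8_t load_byte(unsigned addr, int big_endian)
{
    unsigned lane = addr & 7;

    if (big_endian)
        lane ^= 7;      /* byte 0 occupies the most significant lane */
    return (uint8_t)(mem[addr >> 3] >> (8 * lane));
}

/* Aligned doubleword load (the LD case): no lane selection is involved,
 * so the result does not depend on the endianness. */
uint64_t load_dword(unsigned addr)
{
    return mem[addr >> 3];
}

/* Check that switching endianness XORs the low 3 address bits with 7. */
void check_model(void)
{
    for (unsigned addr = 0; addr < 16; addr++)
        assert(load_byte(addr, 0) == load_byte(addr ^ 7, 1));
}
```

 In this model a byte stream read from consecutive addresses comes out
reversed within each doubleword after an endianness switch, while
load_dword returns the same value either way, which is the property that
suits 64-bit device registers and not byte-oriented DMA buffers.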