On Fri, 1 Mar 2019, Arnd Bergmann wrote: > > I think the whole point of __raw_xyz() is that it's the lowest level > > model. It gives you relaxed ordering (together with the ioremap > > model), and it gives you straight-through behavior. > > > > And yes, any driver using them needs to be aware of the byte ordering, > > which may or may not be the same as regular memory, and may or may not > > be the same as other devices. > > > > So __raw_xyz() is very much for low-level drivers that know what they > > are doing. Caveat user. > > > > "If it breaks, you get to keep both pieces" > > I agree in principle, but I think we already have a lot of precedence > for __raw_xyz() being relied on having a specific behavior in > architecture independent drivers, and I think it makes sense for > architectures to provide that. > > Specifically, I think we need __raw_xyz() to do the same as xyz() > on all little-endian kernels regarding byte ordering (not barriers), and > I would expect it to provide the same ordering and addressing > as swabX(xyz()) on big-endian kernels. > > Without that, using __raw_xyz() to copy between RAM and > buffers in PCI memory space is broken, as you said, but the > assumption would be broken on certain older machines that > do a hardware endian swap by swizzling the address lines rather > than swapping bytes on the data bus. Ah, swizzling! Thanks for reminding me of that. The swizzling machines will be those that do bit (rather than byte) lane matching in hardware and consequently produce a numeric value of data that is consistent between the CPU and a device when accessed in bus-width quantities. These do indeed require address adjustment if you access a narrower quantity. > The best idea I have for working around this is to never rely > on __raw_xyz() to not do byte swapping in platform specific > drivers with CPU-endian MMIO space, but to have a platform > specific set of wrappers around the normal I/O functions, and > make __raw_xyz() just do whatever we expect them to do on > PCI devices. For the record in the MIPS port we have the following accessors: 1. `readX', `writeX', `inX', `outX': data passed in the CPU endianness, ordered WRT MMIO and (for reads) WRT memory; suitable for device CSR accesses, 2. `__mem_readX', `__mem_writeX', `__mem_inX', `__mem_outX': data passed in the memory endianness, ordered WRT MMIO and (for reads) WRT memory, suitable for data transfers between memory and a device, e.g. PIO ATA, 3. `__relaxed_readX', `__relaxed_writeX': data passed in the CPU endianness, ordered WRT MMIO only; aliased to `*_relaxed', 4. `__raw_readX', `__raw_writeX': data passed straight-through with no transformation, ordered WRT MMIO and (for reads) WRT memory. For correct operation of generic drivers we need at least #1 and #2, and we need #4 for platform drivers (which will know they're on a local bus); #3 is an optimisation only that is nice having, but drivers would do with #1 instead. Note however that depending on the wiring of a particular big-endian machine we may have to swizzle addresses for #1, #2, #3, and then byte-swap either #1 and #3, or #2. See arch/mips/include/asm/*/mangle-port.h for all the mess. It actually took quite a while back in early 2000s to get PIO ATA to work correctly on big-endian MIPS machines, because people did not understand the subtlety between the CPU and the memory endianness, and consequently that different accessors are required for the 16-bit ATA data register; you just can't get 16-bit ATA data transfers and 16-bit CSR accesses made elsewhere right with the use of a single accessor only, we need two. I haven't looked at what other ports did in this area, but we do need at least #1 and #2, strongly-ordered, kernel-wide, and then we can discuss what the semantics of `__raw_xyz' is supposed to be, but at least some ports will have to retain the semantics of #4, strongly-ordered, under whatever name. I think using a uniform one would be advantageous though, for obvious reasons. NB the old IDE driver had special local provisions for PIO, which wired #2 accessors on MIPS by including a platform specific header that provided IDE-specific names and I suppose libata carried that over or it wouldn't work. But I think that instead of having such driver-specific hacks that require port maintainers' attention on a case-by-case basis we really ought to provide #2 semantics kernel-wide. Maciej