On Mon, Feb 03, 2025 at 12:43:01PM +0100, Arnd Bergmann wrote: > On Mon, Feb 3, 2025, at 09:25, Thomas Weißschuh wrote: > > On Mon, Feb 03, 2025 at 09:18:33AM +0100, Arnd Bergmann wrote: > >> I have not tried it yet, but I suspect this is not the correct > >> fix here. Unfortunately the indirect header inclusions in this > >> file are way too complicated with corner cases in various > >> architectures. How much testing have you given your patch > >> across other targets? I think the last time we tried to address > >> it, we broke mips or parisc. > > > > It was build-tested on 0day. > > I also gave it some light boot testing on kunit/qemu. > > (Neither on mips or parisc) > > Ok, I see. I checked the usage of these functions again and > now remembered the reason we didn't fix it last time, which is > that the semantics are inconsistent across architectures > and I think this should be addressed first, so we can untangle > the definitions: > > There is one driver that is specific to ARM processors > (drivers/firmware/arm_scmi) using ioread64_hi_lo/iowrite64_hi_lo > and this uses the documented semantics from > Documentation/driver-api/device-io.rst, which says that the > helpers always do separate 32-bit accesses (even on 64-bit > machines), but that it's possible to just call > ioread64()/iowrite64() after including linux/io-64-nonatomic-hi-lo.h > in order to always get 64-bit accesses on 64-bit architectures > but get 32-bit accesses on 32-bit architectures. There are > another dozen or so drivers that do this. > > There are two other drivers that were written for x86 that > use other semantics (drivers/net/wwan/t7xx/ and > drivers/ptp/ptp_pch.c): Here, the definition from lib/iomap.c > means that even on 64-bit architectures calling > ioread64_hi_lo/iowrite64_hi_lo on an MMIO space always > results in a 64-bit access, Interesting... I believe both cases mentioned in this paragraph were written with only the include/linux/io-64*.h in mind. > and only the PIO version is split > up. On 32-bit architectures, this part is not provided, so > it should fall back to split access (I think this is where > we had bugs in the past). > > Another complication is that alpha, parisc and sh (not mips > any more) explicitly include asm-generic/iomap.h but don't > select CONFIG_GENERIC_IOMAP. At this time, GENERIC_IOMAP > is selected at least in some configurations on m68k, mips, > powerpc, sh, um and x86, but most of these should not do that > (mips, m68k and sh have no PIO instructions, powerpc had > a hack that I think was just removed but needs more cleanup). > > I'm testing with the patch below now, which separates the > CONFIG_GENERIC_IOMAP implementation (with the 32-bit PIO > access) from the normal version, and picks an appropriate > one in linux/io-64-nonatomic-*.h based on the architecture > to avoid some of the more confusing nested > "#ifdef ioread64" etc checks. > > I checked that none of the three drivers ever actually uses > PIO registers instead of MMIO, and since none of them use the > big-endian accessors, this all turns into readq/writeq > in practice anyway. > > The ptp_pch.c driver still needs more work as I noticed two > issues there: the driver has a mix of lo_hi and hi_lo > variants, but it is unclear whether is is actually required > on 32-bit or if the hardware works the same either way. PTP is special, some registers are related to a read timestamp IIRC and hence hi_lo is a must there. > In addition, these seem to be timer registers that may overrun > from the lo into the hi field between the two accesses, so > technically a 32-bit host needs to do an extra read to > check for overflow and possibly repeat the access. Yes, precisely why hi_lo is used to minimize the error when it races like this. But IIRC *_pch drivers are for 32-bit platform, the code, if so, was made to be compiled on 64-bit but never used IRL, just for test coverage. (I believe the PCH stands for EG20 PCH, I have [32-bit] boards with it.) ... I like the lib/* and include/* cleanup but PTP probably should stay as is. OTOH WWAN case most likely had been tested on 64-bit platforms only and proves that readq()/writeq() works there, so it's okay to unify. -- With Best Regards, Andy Shevchenko