On Mon, Feb 3, 2025, at 09:25, Thomas Weißschuh wrote: > On Mon, Feb 03, 2025 at 09:18:33AM +0100, Arnd Bergmann wrote: >> I have not tried it yet, but I suspect this is not the correct >> fix here. Unfortunately the indirect header inclusions in this >> file are way too complicated with corner cases in various >> architectures. How much testing have you given your patch >> across other targets? I think the last time we tried to address >> it, we broke mips or parisc. > > It was build-tested on 0day. > I also gave it some light boot testing on kunit/qemu. > (Neither on mips or parisc) Ok, I see. I checked the usage of these functions again and now remembered the reason we didn't fix it last time, which is that the semantics are inconsistent across architectures and I think this should be addressed first, so we can untangle the definitions: There is one driver that is specific to ARM processors (drivers/firmware/arm_scmi) using ioread64_hi_lo/iowrite64_hi_lo and this uses the documented semantics from Documentation/driver-api/device-io.rst, which says that the helpers always do separate 32-bit accesses (even on 64-bit machines), but that it's possible to just call ioread64()/iowrite64() after including linux/io-64-nonatomic-hi-lo.h in order to always get 64-bit accesses on 64-bit architectures but get 32-bit accesses on 32-bit architectures. There are another dozen or so drivers that do this. There are two other drivers that were written for x86 that use other semantics (drivers/net/wwan/t7xx/ and drivers/ptp/ptp_pch.c): Here, the definition from lib/iomap.c means that even on 64-bit architectures calling ioread64_hi_lo/iowrite64_hi_lo on an MMIO space always results in a 64-bit access, and only the PIO version is split up. On 32-bit architectures, this part is not provided, so it should fall back to split access (I think this is where we had bugs in the past). Another complication is that alpha, parisc and sh (not mips any more) explicitly include asm-generic/iomap.h but don't select CONFIG_GENERIC_IOMAP. At this time, GENERIC_IOMAP is selected at least in some configurations on m68k, mips, powerpc, sh, um and x86, but most of these should not do that (mips, m68k and sh have no PIO instructions, powerpc had a hack that I think was just removed but needs more cleanup). I'm testing with the patch below now, which separates the CONFIG_GENERIC_IOMAP implementation (with the 32-bit PIO access) from the normal version, and picks an appropriate one in linux/io-64-nonatomic-*.h based on the architecture to avoid some of the more confusing nested "#ifdef ioread64" etc checks. I checked that none of the three drivers ever actually uses PIO registers instead of MMIO, and since none of them use the big-endian accessors, this all turns into readq/writeq in practice anyway. The ptp_pch.c driver still needs more work as I noticed two issues there: the driver has a mix of lo_hi and hi_lo variants, but it is unclear whether is is actually required on 32-bit or if the hardware works the same either way. In addition, these seem to be timer registers that may overrun from the lo into the hi field between the two accesses, so technically a 32-bit host needs to do an extra read to check for overflow and possibly repeat the access. Arnd diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c index 60050da54bf2..1c75a4c9c371 100644 --- a/drivers/firmware/arm_scmi/driver.c +++ b/drivers/firmware/arm_scmi/driver.c @@ -1997,17 +1997,7 @@ static void scmi_common_fastchannel_db_ring(struct scmi_fc_db_info *db) else if (db->width == 4) SCMI_PROTO_FC_RING_DB(32); else /* db->width == 8 */ -#ifdef CONFIG_64BIT SCMI_PROTO_FC_RING_DB(64); -#else - { - u64 val = 0; - - if (db->mask) - val = ioread64_hi_lo(db->addr) & db->mask; - iowrite64_hi_lo(db->set | val, db->addr); - } -#endif } /** diff --git a/drivers/net/wwan/t7xx/t7xx_cldma.c b/drivers/net/wwan/t7xx/t7xx_cldma.c index f0a4783baf1f..9f43f256db1d 100644 --- a/drivers/net/wwan/t7xx/t7xx_cldma.c +++ b/drivers/net/wwan/t7xx/t7xx_cldma.c @@ -106,7 +106,7 @@ bool t7xx_cldma_tx_addr_is_set(struct t7xx_cldma_hw *hw_info, unsigned int qno) { u32 offset = REG_CLDMA_UL_START_ADDRL_0 + qno * ADDR_SIZE; - return ioread64_lo_hi(hw_info->ap_pdn_base + offset); + return ioread64(hw_info->ap_pdn_base + offset); } void t7xx_cldma_hw_set_start_addr(struct t7xx_cldma_hw *hw_info, unsigned int qno, u64 address, @@ -117,7 +117,7 @@ void t7xx_cldma_hw_set_start_addr(struct t7xx_cldma_hw *hw_info, unsigned int qn reg = tx_rx == MTK_RX ? hw_info->ap_ao_base + REG_CLDMA_DL_START_ADDRL_0 : hw_info->ap_pdn_base + REG_CLDMA_UL_START_ADDRL_0; - iowrite64_lo_hi(address, reg + offset); + iowrite64(address, reg + offset); } void t7xx_cldma_hw_resume_queue(struct t7xx_cldma_hw *hw_info, unsigned int qno, diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c index 97163e1e5783..13f5a2442d20 100644 --- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c +++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c @@ -137,9 +137,9 @@ static int t7xx_cldma_gpd_rx_from_q(struct cldma_queue *queue, int budget, bool return -ENODEV; } - gpd_addr = ioread64_lo_hi(hw_info->ap_pdn_base + - REG_CLDMA_DL_CURRENT_ADDRL_0 + - queue->index * sizeof(u64)); + gpd_addr = ioread64(hw_info->ap_pdn_base + + REG_CLDMA_DL_CURRENT_ADDRL_0 + + queue->index * sizeof(u64)); if (req->gpd_addr == gpd_addr || hwo_polling_count++ >= 100) return 0; @@ -317,8 +317,8 @@ static void t7xx_cldma_txq_empty_hndl(struct cldma_queue *queue) struct t7xx_cldma_hw *hw_info = &md_ctrl->hw_info; /* Check current processing TGPD, 64-bit address is in a table by Q index */ - ul_curr_addr = ioread64_lo_hi(hw_info->ap_pdn_base + REG_CLDMA_UL_CURRENT_ADDRL_0 + - queue->index * sizeof(u64)); + ul_curr_addr = ioread64(hw_info->ap_pdn_base + REG_CLDMA_UL_CURRENT_ADDRL_0 + + queue->index * sizeof(u64)); if (req->gpd_addr != ul_curr_addr) { spin_unlock_irqrestore(&md_ctrl->cldma_lock, flags); dev_err(md_ctrl->dev, "CLDMA%d queue %d is not empty\n", diff --git a/drivers/net/wwan/t7xx/t7xx_pcie_mac.c b/drivers/net/wwan/t7xx/t7xx_pcie_mac.c index f071ec7ff23d..76da4c15e3de 100644 --- a/drivers/net/wwan/t7xx/t7xx_pcie_mac.c +++ b/drivers/net/wwan/t7xx/t7xx_pcie_mac.c @@ -75,7 +75,7 @@ static void t7xx_pcie_mac_atr_tables_dis(void __iomem *pbase, enum t7xx_atr_src_ for (i = 0; i < ATR_TABLE_NUM_PER_ATR; i++) { offset = ATR_PORT_OFFSET * port + ATR_TABLE_OFFSET * i; reg = pbase + ATR_PCIE_WIN0_T0_ATR_PARAM_SRC_ADDR + offset; - iowrite64_lo_hi(0, reg); + iowrite64(0, reg); } } @@ -112,17 +112,17 @@ static int t7xx_pcie_mac_atr_cfg(struct t7xx_pci_dev *t7xx_dev, struct t7xx_atr_ reg = pbase + ATR_PCIE_WIN0_T0_TRSL_ADDR + offset; value = cfg->trsl_addr & ATR_PCIE_WIN0_ADDR_ALGMT; - iowrite64_lo_hi(value, reg); + iowrite64(value, reg); reg = pbase + ATR_PCIE_WIN0_T0_TRSL_PARAM + offset; iowrite32(cfg->trsl_id, reg); reg = pbase + ATR_PCIE_WIN0_T0_ATR_PARAM_SRC_ADDR + offset; value = (cfg->src_addr & ATR_PCIE_WIN0_ADDR_ALGMT) | (atr_size << 1) | BIT(0); - iowrite64_lo_hi(value, reg); + iowrite64(value, reg); /* Ensure ATR is set */ - ioread64_lo_hi(reg); + ioread64(reg); return 0; } diff --git a/drivers/ptp/ptp_pch.c b/drivers/ptp/ptp_pch.c index b8a9a54a176c..1d3b1f83cda8 100644 --- a/drivers/ptp/ptp_pch.c +++ b/drivers/ptp/ptp_pch.c @@ -13,7 +13,6 @@ #include <linux/interrupt.h> #include <linux/io.h> #include <linux/io-64-nonatomic-lo-hi.h> -#include <linux/io-64-nonatomic-hi-lo.h> #include <linux/irq.h> #include <linux/kernel.h> #include <linux/module.h> @@ -147,14 +146,14 @@ static u64 pch_systime_read(struct pch_ts_regs __iomem *regs) { u64 ns; - ns = ioread64_lo_hi(®s->systime_lo); + ns = ioread64(®s->systime_lo); return ns << TICKS_NS_SHIFT; } static void pch_systime_write(struct pch_ts_regs __iomem *regs, u64 ns) { - iowrite64_lo_hi(ns >> TICKS_NS_SHIFT, ®s->systime_lo); + iowrite64(ns >> TICKS_NS_SHIFT, ®s->systime_lo); } static inline void pch_block_reset(struct pch_dev *chip) @@ -221,7 +220,7 @@ u64 pch_rx_snap_read(struct pci_dev *pdev) struct pch_dev *chip = pci_get_drvdata(pdev); u64 ns; - ns = ioread64_lo_hi(&chip->regs->rx_snap_lo); + ns = ioread64(&chip->regs->rx_snap_lo); return ns << TICKS_NS_SHIFT; } @@ -232,7 +231,7 @@ u64 pch_tx_snap_read(struct pci_dev *pdev) struct pch_dev *chip = pci_get_drvdata(pdev); u64 ns; - ns = ioread64_lo_hi(&chip->regs->tx_snap_lo); + ns = ioread64(&chip->regs->tx_snap_lo); return ns << TICKS_NS_SHIFT; } @@ -283,7 +282,7 @@ int pch_set_station_address(u8 *addr, struct pci_dev *pdev) } dev_dbg(&pdev->dev, "invoking pch_station_set\n"); - iowrite64_lo_hi(mac, &chip->regs->ts_st); + iowrite64(mac, &chip->regs->ts_st); return 0; } EXPORT_SYMBOL(pch_set_station_address); @@ -305,7 +304,7 @@ static irqreturn_t isr(int irq, void *priv) if (pch_dev->exts0_enabled) { event.type = PTP_CLOCK_EXTTS; event.index = 0; - event.timestamp = ioread64_hi_lo(®s->asms_hi); + event.timestamp = ioread64(®s->asms_hi); event.timestamp <<= TICKS_NS_SHIFT; ptp_clock_event(pch_dev->ptp_clock, &event); } @@ -316,7 +315,7 @@ static irqreturn_t isr(int irq, void *priv) if (pch_dev->exts1_enabled) { event.type = PTP_CLOCK_EXTTS; event.index = 1; - event.timestamp = ioread64_hi_lo(®s->asms_hi); + event.timestamp = ioread64(®s->asms_hi); event.timestamp <<= TICKS_NS_SHIFT; ptp_clock_event(pch_dev->ptp_clock, &event); } @@ -493,7 +492,7 @@ pch_probe(struct pci_dev *pdev, const struct pci_device_id *id) pch_reset(chip); iowrite32(DEFAULT_ADDEND, &chip->regs->addend); - iowrite64_lo_hi(1, &chip->regs->trgt_lo); + iowrite64(1, &chip->regs->trgt_lo); iowrite32(PCH_TSE_TTIPEND, &chip->regs->event); pch_eth_enable_set(chip); diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h index 196087a8126e..9f3f25d7fc58 100644 --- a/include/asm-generic/iomap.h +++ b/include/asm-generic/iomap.h @@ -31,42 +31,22 @@ extern unsigned int ioread16(const void __iomem *); extern unsigned int ioread16be(const void __iomem *); extern unsigned int ioread32(const void __iomem *); extern unsigned int ioread32be(const void __iomem *); -#ifdef CONFIG_64BIT -extern u64 ioread64(const void __iomem *); -extern u64 ioread64be(const void __iomem *); -#endif -#ifdef readq -#define ioread64_lo_hi ioread64_lo_hi -#define ioread64_hi_lo ioread64_hi_lo -#define ioread64be_lo_hi ioread64be_lo_hi -#define ioread64be_hi_lo ioread64be_hi_lo -extern u64 ioread64_lo_hi(const void __iomem *addr); -extern u64 ioread64_hi_lo(const void __iomem *addr); -extern u64 ioread64be_lo_hi(const void __iomem *addr); -extern u64 ioread64be_hi_lo(const void __iomem *addr); -#endif +extern u64 __ioread64_lo_hi(const void __iomem *addr); +extern u64 __ioread64_hi_lo(const void __iomem *addr); +extern u64 __ioread64be_lo_hi(const void __iomem *addr); +extern u64 __ioread64be_hi_lo(const void __iomem *addr); extern void iowrite8(u8, void __iomem *); extern void iowrite16(u16, void __iomem *); extern void iowrite16be(u16, void __iomem *); extern void iowrite32(u32, void __iomem *); extern void iowrite32be(u32, void __iomem *); -#ifdef CONFIG_64BIT -extern void iowrite64(u64, void __iomem *); -extern void iowrite64be(u64, void __iomem *); -#endif -#ifdef writeq -#define iowrite64_lo_hi iowrite64_lo_hi -#define iowrite64_hi_lo iowrite64_hi_lo -#define iowrite64be_lo_hi iowrite64be_lo_hi -#define iowrite64be_hi_lo iowrite64be_hi_lo -extern void iowrite64_lo_hi(u64 val, void __iomem *addr); -extern void iowrite64_hi_lo(u64 val, void __iomem *addr); -extern void iowrite64be_lo_hi(u64 val, void __iomem *addr); -extern void iowrite64be_hi_lo(u64 val, void __iomem *addr); -#endif +extern void __iowrite64_lo_hi(u64 val, void __iomem *addr); +extern void __iowrite64_hi_lo(u64 val, void __iomem *addr); +extern void __iowrite64be_lo_hi(u64 val, void __iomem *addr); +extern void __iowrite64be_hi_lo(u64 val, void __iomem *addr); /* * "string" versions of the above. Note that they diff --git a/include/linux/io-64-nonatomic-lo-hi.h b/include/linux/io-64-nonatomic-lo-hi.h index 448a21435dba..70d091769baa 100644 --- a/include/linux/io-64-nonatomic-lo-hi.h +++ b/include/linux/io-64-nonatomic-lo-hi.h @@ -101,22 +101,38 @@ static inline void iowrite64be_lo_hi(u64 val, void __iomem *addr) #ifndef ioread64 #define ioread64_is_nonatomic +#ifdef CONFIG_GENERIC_IOREMAP +#define ioread64 __ioread64_lo_hi +#else #define ioread64 ioread64_lo_hi #endif +#endif #ifndef iowrite64 #define iowrite64_is_nonatomic +#ifdef CONFIG_GENERIC_IOREMAP +#define ioread64 __iowrite64_lo_hi +#else #define iowrite64 iowrite64_lo_hi #endif +#endif #ifndef ioread64be #define ioread64be_is_nonatomic +#ifdef CONFIG_GENERIC_IOREMAP +#define ioread64 __ioread64be_lo_hi +#else #define ioread64be ioread64be_lo_hi #endif +#endif #ifndef iowrite64be #define iowrite64be_is_nonatomic +#ifdef CONFIG_GENERIC_IOREMAP +#define ioread64 __iowrite64be_lo_hi +#else #define iowrite64be iowrite64be_lo_hi #endif +#endif #endif /* _LINUX_IO_64_NONATOMIC_LO_HI_H_ */ diff --git a/lib/iomap.c b/lib/iomap.c index 4f8b31baa575..a65717cd86f7 100644 --- a/lib/iomap.c +++ b/lib/iomap.c @@ -111,7 +111,7 @@ EXPORT_SYMBOL(ioread16be); EXPORT_SYMBOL(ioread32); EXPORT_SYMBOL(ioread32be); -#ifdef readq +#ifdef CONFIG_64BIT static u64 pio_read64_lo_hi(unsigned long port) { u64 lo, hi; @@ -153,21 +153,21 @@ static u64 pio_read64be_hi_lo(unsigned long port) } __no_kmsan_checks -u64 ioread64_lo_hi(const void __iomem *addr) +u64 __ioread64_lo_hi(const void __iomem *addr) { IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr)); return 0xffffffffffffffffULL; } __no_kmsan_checks -u64 ioread64_hi_lo(const void __iomem *addr) +u64 __ioread64_hi_lo(const void __iomem *addr) { IO_COND(addr, return pio_read64_hi_lo(port), return readq(addr)); return 0xffffffffffffffffULL; } __no_kmsan_checks -u64 ioread64be_lo_hi(const void __iomem *addr) +u64 __ioread64be_lo_hi(const void __iomem *addr) { IO_COND(addr, return pio_read64be_lo_hi(port), return mmio_read64be(addr)); @@ -175,19 +175,19 @@ u64 ioread64be_lo_hi(const void __iomem *addr) } __no_kmsan_checks -u64 ioread64be_hi_lo(const void __iomem *addr) +u64 __ioread64be_hi_lo(const void __iomem *addr) { IO_COND(addr, return pio_read64be_hi_lo(port), return mmio_read64be(addr)); return 0xffffffffffffffffULL; } -EXPORT_SYMBOL(ioread64_lo_hi); -EXPORT_SYMBOL(ioread64_hi_lo); -EXPORT_SYMBOL(ioread64be_lo_hi); -EXPORT_SYMBOL(ioread64be_hi_lo); +EXPORT_SYMBOL(__ioread64_lo_hi); +EXPORT_SYMBOL(__ioread64_hi_lo); +EXPORT_SYMBOL(__ioread64be_lo_hi); +EXPORT_SYMBOL(__ioread64be_hi_lo); -#endif /* readq */ +#endif /* CONFIG_64BIT */ #ifndef pio_write16be #define pio_write16be(val,port) outw(swab16(val),port) @@ -236,7 +236,7 @@ EXPORT_SYMBOL(iowrite16be); EXPORT_SYMBOL(iowrite32); EXPORT_SYMBOL(iowrite32be); -#ifdef writeq +#ifdef CONFIG_64BIT static void pio_write64_lo_hi(u64 val, unsigned long port) { outl(val, port); @@ -261,7 +261,7 @@ static void pio_write64be_hi_lo(u64 val, unsigned long port) pio_write32be(val, port + sizeof(u32)); } -void iowrite64_lo_hi(u64 val, void __iomem *addr) +void __iowrite64_lo_hi(u64 val, void __iomem *addr) { /* Make sure uninitialized memory isn't copied to devices. */ kmsan_check_memory(&val, sizeof(val)); @@ -269,7 +269,7 @@ void iowrite64_lo_hi(u64 val, void __iomem *addr) writeq(val, addr)); } -void iowrite64_hi_lo(u64 val, void __iomem *addr) +void __iowrite64_hi_lo(u64 val, void __iomem *addr) { /* Make sure uninitialized memory isn't copied to devices. */ kmsan_check_memory(&val, sizeof(val)); @@ -277,7 +277,7 @@ void iowrite64_hi_lo(u64 val, void __iomem *addr) writeq(val, addr)); } -void iowrite64be_lo_hi(u64 val, void __iomem *addr) +void __iowrite64be_lo_hi(u64 val, void __iomem *addr) { /* Make sure uninitialized memory isn't copied to devices. */ kmsan_check_memory(&val, sizeof(val)); @@ -285,7 +285,7 @@ void iowrite64be_lo_hi(u64 val, void __iomem *addr) mmio_write64be(val, addr)); } -void iowrite64be_hi_lo(u64 val, void __iomem *addr) +void __iowrite64be_hi_lo(u64 val, void __iomem *addr) { /* Make sure uninitialized memory isn't copied to devices. */ kmsan_check_memory(&val, sizeof(val)); @@ -293,12 +293,12 @@ void iowrite64be_hi_lo(u64 val, void __iomem *addr) mmio_write64be(val, addr)); } -EXPORT_SYMBOL(iowrite64_lo_hi); -EXPORT_SYMBOL(iowrite64_hi_lo); -EXPORT_SYMBOL(iowrite64be_lo_hi); -EXPORT_SYMBOL(iowrite64be_hi_lo); +EXPORT_SYMBOL(__iowrite64_lo_hi); +EXPORT_SYMBOL(__iowrite64_hi_lo); +EXPORT_SYMBOL(__iowrite64be_lo_hi); +EXPORT_SYMBOL(__iowrite64be_hi_lo); -#endif /* readq */ +#endif /* CONFIG_64BIT */ /* * These are the "repeat MMIO read/write" functions.