Hi Tony, On Tue, 8 Nov 2022 18:07:37 -0500, Tony Battersby wrote: > (resending in plaintext; damn Thunderbird upgrades...) > > While looking up documentation for PCI write posting I noticed that the > example in Documentation/driver-api/io_ordering.rst seems to contradict > Documentation/memory-barriers.txt: > > ----------- > > Documentation/driver-api/io_ordering.rst: > > On some platforms, so-called memory-mapped I/O is weakly ordered. On such > platforms, driver writers are responsible for ensuring that I/O writes to > memory-mapped addresses on their device arrive in the order intended. This is > typically done by reading a 'safe' device or bridge register, causing the I/O > chipset to flush pending writes to the device before any reads are posted. A > driver would usually use this technique immediately prior to the exit of a > critical section of code protected by spinlocks. This would ensure that > subsequent writes to I/O space arrived only after all prior writes (much like a > memory barrier op, mb(), only with respect to I/O). > > A more concrete example from a hypothetical device driver:: > > ... > CPU A: spin_lock_irqsave(&dev_lock, flags) > CPU A: val = readl(my_status); > CPU A: ... > CPU A: writel(newval, ring_ptr); > CPU A: spin_unlock_irqrestore(&dev_lock, flags) > ... > CPU B: spin_lock_irqsave(&dev_lock, flags) > CPU B: val = readl(my_status); > CPU B: ... > CPU B: writel(newval2, ring_ptr); > CPU B: spin_unlock_irqrestore(&dev_lock, flags) > ... > > In the case above, the device may receive newval2 before it receives newval, > which could cause problems. Fixing it is easy enough though:: > > ... > CPU A: spin_lock_irqsave(&dev_lock, flags) > CPU A: val = readl(my_status); > CPU A: ... > CPU A: writel(newval, ring_ptr); > CPU A: (void)readl(safe_register); /* maybe a config register? */ > CPU A: spin_unlock_irqrestore(&dev_lock, flags) > ... > CPU B: spin_lock_irqsave(&dev_lock, flags) > CPU B: val = readl(my_status); > CPU B: ... > CPU B: writel(newval2, ring_ptr); > CPU B: (void)readl(safe_register); /* maybe a config register? */ > CPU B: spin_unlock_irqrestore(&dev_lock, flags) > > Here, the reads from safe_register will cause the I/O chipset to flush any > pending writes before actually posting the read to the chipset, preventing > possible data corruption. > > ----------- > > Documentation/memory-barriers.txt: > > ========================== > KERNEL I/O BARRIER EFFECTS > ========================== > > Interfacing with peripherals via I/O accesses is deeply architecture and device > specific. Therefore, drivers which are inherently non-portable may rely on > specific behaviours of their target systems in order to achieve synchronization > in the most lightweight manner possible. For drivers intending to be portable > between multiple architectures and bus implementations, the kernel offers a > series of accessor functions that provide various degrees of ordering > guarantees: > > (*) readX(), writeX(): > > The readX() and writeX() MMIO accessors take a pointer to the > peripheral being accessed as an __iomem * parameter. For pointers > mapped with the default I/O attributes (e.g. those returned by > ioremap()), the ordering guarantees are as follows: > > 1. All readX() and writeX() accesses to the same peripheral are ordered > with respect to each other. This ensures that MMIO register accesses > by the same CPU thread to a particular device will arrive in program > order. > > 2. A writeX() issued by a CPU thread holding a spinlock is ordered > before a writeX() to the same peripheral from another CPU thread > issued after a later acquisition of the same spinlock. This ensures > that MMIO register writes to a particular device issued while holding > a spinlock will arrive in an order consistent with acquisitions of > the lock. > > ----------- > > To summarize: > > io_ordering.rst says to use readX() before spin_unlock to order writeX() > calls (on some platforms). > > memory-barriers.txt says writeX() calls are already ordered when holding > the same spinlock. > > So...which one is correct? >From quick glance of io_ordering.rst's git history, contents of this file is never updated since the beginning of Git history (v2.6.12.rc2). Which strongly suggests that you can ignore io_ordering.rst. Thanks, Akira PS: Do we need to keep that outdated document??? I think Documentation/driver-api/device-io.rst is the one properly maintained. > > > There is another example of flushing posted PCI writes at the bottom of > Documentation/PCI/pci.rst, but that one is consistent with > memory-barriers.txt. > > Tony Battersby > Cybernetics >