On Thu, Mar 23, 2023 at 04:34:36PM +0100, Sebastian Andrzej Siewior wrote: > From: Haris Okanovic <haris.okanovic@xxxxxx> > > ioread8() operations to TPM MMIO addresses can stall the CPU when > immediately following a sequence of iowrite*()'s to the same region. > > For example, cyclitest measures ~400us latency spikes when a non-RT > usermode application communicates with an SPI-based TPM chip (Intel Atom > E3940 system, PREEMPT_RT kernel). The spikes are caused by a > stalling ioread8() operation following a sequence of 30+ iowrite8()s to > the same address. I believe this happens because the write sequence is > buffered (in CPU or somewhere along the bus), and gets flushed on the > first LOAD instruction (ioread*()) that follows. > > The enclosed change appears to fix this issue: read the TPM chip's > access register (status code) after every iowrite*() operation to > amortize the cost of flushing data to chip across multiple instructions. > > Signed-off-by: Haris Okanovic <haris.okanovic@xxxxxx> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> > --- > > I don't know how performance critical this is so that it should be > restricted to PREEMPT_RT. This has been in RT queue since late 2017 and > I have no idea how to deal with this differently/ in a more generic way. > Original thread: > https://lore.kernel.org/20170804215651.29247-1-haris.okanovic@xxxxxx > > drivers/char/tpm/tpm_tis.c | 29 +++++++++++++++++++++++++++-- > 1 file changed, 27 insertions(+), 2 deletions(-) > > diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c > index ed5dabd3c72d6..513e0d1c349a6 100644 > --- a/drivers/char/tpm/tpm_tis.c > +++ b/drivers/char/tpm/tpm_tis.c > @@ -50,6 +50,31 @@ static inline struct tpm_tis_tcg_phy *to_tpm_tis_tcg_phy(struct tpm_tis_data *da > return container_of(data, struct tpm_tis_tcg_phy, priv); > } > > +#ifdef CONFIG_PREEMPT_RT > +/* > + * Flushes previous write operations to chip so that a subsequent > + * ioread*()s won't stall a CPU. > + */ I would replace this with: /* * Flush previous write operations with a dummy read operation to the * TPM MMIO base address. */ I think rest of the reasoning would be better place to the functions, which are call sites for this helper, and here it would be make more sense to explain what it actually does. > +static inline void tpm_tis_flush(void __iomem *iobase) > +{ > + ioread8(iobase + TPM_ACCESS(0)); > +} > +#else > +#define tpm_tis_flush(iobase) do { } while (0) > +#endif > + > +static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr) /* * Write a byte to the TPM MMIO address, and flush the write queue. The * flush amortizes the cost of the IO operations, and thus avoids unwanted * latency peaks. */ > +{ > + iowrite8(b, iobase + addr); > + tpm_tis_flush(iobase); > +} > + /* * Write a 32-bit word to the TPM MMIO address, and flush the write queue. * The flush amortizes the cost of the IO operations, and thus avoids * unwanted latency peaks. * > +static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr) > +{ > + iowrite32(b, iobase + addr); > + tpm_tis_flush(iobase); > +} > + > static int interrupts = -1; > module_param(interrupts, int, 0444); > MODULE_PARM_DESC(interrupts, "Enable interrupts"); > @@ -186,12 +211,12 @@ static int tpm_tcg_write_bytes(struct tpm_tis_data *data, u32 addr, u16 len, > switch (io_mode) { > case TPM_TIS_PHYS_8: > while (len--) > - iowrite8(*value++, phy->iobase + addr); > + tpm_tis_iowrite8(*value++, phy->iobase, addr); > break; > case TPM_TIS_PHYS_16: > return -EINVAL; > case TPM_TIS_PHYS_32: > - iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase + addr); > + tpm_tis_iowrite32(le32_to_cpu(*((__le32 *)value)), phy->iobase, addr); > break; > } > > -- > 2.40.0 > Thanks for catching this up. It is a small code change but I think that it would deserve just a bit more documentation, as it would make sure that the reasoning you gave is taken into account in the future code reviews. I think that if you append what I suggested above (you can use your judgement and edit as you will), it should be sufficient. BR, Jarkko BR, Jarkko