On Mon, 2023-05-22 at 17:26 +0100, Robin Murphy wrote:
> On 2023-05-15 10:15, Niklas Schnelle wrote:
> > In some virtualized environments, including s390 paged memory guests,
> > IOTLB flushes are used to update IOMMU shadow tables. Due to this, they
> > are much more expensive than in typical bare metal environments or
> > non-paged s390 guests. In addition they may parallelize more poorly in
> > virtualized environments. This changes the trade-off for flushing IOVAs
> > such that minimizing the number of IOTLB flushes trumps any benefit of
> > cheaper queuing operations or increased parallelism.
> > 
> > In this scenario per-CPU flush queues pose several problems. Firstly
> > per-CPU memory is often quite limited prohibiting larger queues.
> > Secondly collecting IOVAs per-CPU but flushing via a global timeout
> > reduces the number of IOVAs flushed for each timeout especially on s390
> > where PCI interrupts may not be bound to a specific CPU.
> > 
> > Let's introduce a single flush queue mode that reuses the same queue
> > logic but only allocates a single global queue. This mode can be
> > selected as a flag bit in a new dma_iommu_options struct which can be
> > modified from its defaults by IOMMU drivers implementing a new
> > ops.tune_dma_iommu() callback. As a first user the s390 IOMMU driver
> > selects the single queue mode if IOTLB flushes are needed on map which
> > indicates shadow table use. With the unchanged small FQ size and
> > timeouts this setting is worse than per-CPU queues but a follow-up patch
> > will make the FQ size and timeout variable. Together this allows the
> > common IOVA flushing code to more closely resemble the global flush
> > behavior used on s390's previous internal DMA API implementation.
> > 
> > Link: https://lore.kernel.org/linux-iommu/3e402947-61f9-b7e8-1414-fde006257b6f@xxxxxxx/
> > Reviewed-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx> #s390
> > Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
> > ---
> >  drivers/iommu/dma-iommu.c  | 163 ++++++++++++++++++++++++++++++++++-----------
> >  drivers/iommu/dma-iommu.h  |   4 +-
> >  drivers/iommu/iommu.c      |  18 +++-
> >  drivers/iommu/s390-iommu.c |  10 +++
> >  include/linux/iommu.h      |  21 ++++++
> >  5 files changed, 169 insertions(+), 47 deletions(-)
> > ---8<---
> > 
> > +/**
> > + * struct dma_iommu_options - Options for dma-iommu
> > + *
> > + * @flags: Flag bits for enabling/disabling dma-iommu settings
> > + *
> > + * This structure is intended to provide IOMMU drivers a way to influence the
> > + * behavior of the dma-iommu DMA API implementation. This allows optimizing for
> > + * example for a virtualized environment with slow IOTLB flushes.
> > + */
> > +struct dma_iommu_options {
> > +#define IOMMU_DMA_OPTS_PER_CPU_QUEUE	(0L << 0)
> > +#define IOMMU_DMA_OPTS_SINGLE_QUEUE	(1L << 0)
> > +	u64	flags;
> > +};
> 
> I think for now this can just use a bit in dev_iommu to indicate that
> the device will prefer a global flush queue; s390 can set that in
> .probe_device, then iommu_dma_init_domain() can propagate it to an
> equivalent flag in the cookie (possibly even a new cookie type?) that
> iommu_dma_init_fq() can then consume. Then just make the s390 parameters
> from patch #6 the standard parameters for a global queue.
> 
> Thanks,
> Robin.

Working on this now. How about I move the struct dma_iommu_options
definition into dma-iommu.c, keeping it as part of struct
iommu_dma_cookie? That way we can still keep the flags, timeout and
queue size organized the same way, just internal to dma-iommu.c.
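As a rough, untested sketch of what I mean (field names tentative, the
existing cookie members elided):

/* drivers/iommu/dma-iommu.c, sketch only */
struct dma_iommu_options {
#define IOMMU_DMA_OPTS_PER_CPU_QUEUE	(0L << 0)
#define IOMMU_DMA_OPTS_SINGLE_QUEUE	(1L << 0)
	u64		flags;
	size_t		fq_size;
	unsigned int	fq_timeout;
};

struct iommu_dma_cookie {
	enum iommu_dma_cookie_type	type;
	/* ... existing members unchanged ... */
	/* options tuning the DMA API behavior, private to dma-iommu.c */
	struct dma_iommu_options	options;
};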
We would then set them in iommu_dma_init_domain(), triggered by a
"shadow_on_flush" flag in struct dev_iommu. That way we can keep most
of the same code but only add a single flag as the external interface.
The flag also states an explicit fact about the IOMMU device itself,
namely that IOTLB flushes do extra shadowing work, while the decision
to then use a longer timeout and a larger queue stays with dma-iommu.c.
I think that's overall a better match of responsibilities.
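Again just a rough, untested sketch of the wiring I have in mind; the
s390 part assumes we keep keying off zdev->tlb_refresh as in the
current patch, and the IOVA_SINGLE_FQ_* constants are placeholders for
whatever defaults dma-iommu.c ends up picking:

/* include/linux/iommu.h, sketch */
struct dev_iommu {
	/* ... existing members unchanged ... */
	u32	shadow_on_flush:1; /* IOTLB flushes update shadow tables */
};

/* drivers/iommu/s390-iommu.c, in s390_iommu_probe_device(), sketch */
	if (zdev->tlb_refresh)
		dev->iommu->shadow_on_flush = 1;

/* drivers/iommu/dma-iommu.c, in iommu_dma_init_domain(), sketch */
	if (dev->iommu->shadow_on_flush) {
		cookie->options.flags |= IOMMU_DMA_OPTS_SINGLE_QUEUE;
		/* placeholder defaults for the single queue case */
		cookie->options.fq_timeout = IOVA_SINGLE_FQ_TIMEOUT;
		cookie->options.fq_size = IOVA_SINGLE_FQ_SIZE;
	}

Thanks,
Niklas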