On 2/15/23 7:03 AM, Niklas Schnelle wrote:
> Flush queues currently use a fixed compile time size of 256 entries.
> This being a power of 2 allows the compiler to use shift and mask
> instead of more expensive modulo operations. With per-CPU flush queues
> larger queue sizes would hit per-CPU allocation limits, with a single
> flush queue these limits do not apply however. Also with single queues
> being particularly suitable for virtualized environments with expensive
> IOTLB flushes these benefit especially from larger queues and thus fewer
> flushes.
>
> To this end re-order struct iova_fq so we can use a dynamic array and
> introduce the flush queue size and timeouts as new options in the
> dma_iommu_options struct. So as not to lose the shift and mask
> optimization, check that the variable length is a power of 2 and use
> explicit shift and mask instead of letting the compiler optimize this.
>
> In the s390 IOMMU driver a large fixed queue size and timeout is then
> set together with single queue mode bringing its performance on s390
> paged memory guests on par with the previous s390 specific DMA API
> implementation.
>
> Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>

Reviewed-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx> #s390

> +#define S390_IOMMU_SINGLE_FQ_SIZE	32768
> +#define S390_IOMMU_SINGLE_FQ_TIMEOUT	1000
> +

One question about these values, however: was there a rationale for
choosing these particular numbers (anything worth documenting?), or were
they simply chosen because they showed similar characteristics to the
previous DMA approach? I'm mostly wondering if it's worth experimenting
with other values here in the future to see what kind of impact they
would have.
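
As an aside for anyone who does experiment with other sizes: the commit
message requires the queue length to be a power of 2 so that index
wrap-around can use a mask instead of a modulo. Below is a minimal
userspace sketch of that constraint. The helper names fq_size_is_pow2()
and fq_ring_next() are hypothetical and are not taken from the patch;
the actual kernel code may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if n is a non-zero power of two. */
static bool fq_size_is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: advance a ring index by one slot.
 * Gives the same result as (idx + 1) % size, but is only valid
 * when size is a power of two, hence the assertion.
 */
static size_t fq_ring_next(size_t idx, size_t size)
{
	assert(fq_size_is_pow2(size));
	return (idx + 1) & (size - 1);
}
```

With these helpers, a candidate size such as 32768 passes the check,
while a value like 1000 would be rejected as a queue size.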