On Thu, 22 Aug 2024 11:37:13 -0700 mhkelley58@xxxxxxxxx wrote: > From: Michael Kelley <mhklinux@xxxxxxxxxxx> > > When a DMA map request is for a SGL, each SGL entry results in an > independent mapping operation. If the mapping requires a bounce buffer > due to running in a CoCo VM or due to swiotlb=force on the boot line, > swiotlb is invoked. If swiotlb throttling is enabled for the request, > each SGL entry results in a separate throttling operation. This is > problematic because a thread may be holding swiotlb memory while waiting > for memory to become free. > > Resolve this problem by only allowing throttling on the 0th SGL > entry. When unmapping the SGL, unmap entries 1 thru N-1 first, then > unmap entry 0 so that the throttle isn't released until all swiotlb > memory has been freed. > > Signed-off-by: Michael Kelley <mhklinux@xxxxxxxxxxx> > --- > This approach to SGLs muddies the line between DMA direct and swiotlb > throttling functionality. To keep the MAY_BLOCK attr fully generic, it > should propagate to the mapping of all SGL entries. > > An alternate approach is to define an additional DMA attribute that > is internal to the DMA layer. Instead of clearing MAX_BLOCK, this > attr is added by dma_direct_map_sg() when mapping SGL entries other > than the 0th entry. swiotlb would do throttling only when MAY_BLOCK > is set and this new attr is not set. > > This approach has a modest amount of additional complexity. Given > that we currently have no other users of the MAY_BLOCK attr, the > conceptual cleanliness may not be warranted until we do. > > Thoughts? If we agree to change the unthrottling logic (see my comment to your RFC 1/7), we'll need an additional attribute to delay unthrottling when unmapping sg list entries 1 to N-1. This attribute could convey that the mapping is the non-initial segment of an sg list and it could then be also used to disable blocking in swiotlb_tbl_map_single(). > > kernel/dma/direct.c | 35 ++++++++++++++++++++++++++++++----- > 1 file changed, 30 insertions(+), 5 deletions(-) > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c > index 4480a3cd92e0..80e03c0838d4 100644 > --- a/kernel/dma/direct.c > +++ b/kernel/dma/direct.c > @@ -438,6 +438,18 @@ void dma_direct_sync_sg_for_cpu(struct device *dev, > arch_sync_dma_for_cpu_all(); > } > > +static void dma_direct_unmap_sgl_entry(struct device *dev, > + struct scatterlist *sgl, enum dma_data_direction dir, Nitpick: This parameter should probably be called "sg", because it is never used to do any operation on the whole list. Similarly, the function could be called dma_direct_unmap_sg_entry(), because there is no dma_direct_unmap_sgl() either... > + unsigned long attrs) > + > +{ > + if (sg_dma_is_bus_address(sgl)) > + sg_dma_unmark_bus_address(sgl); > + else > + dma_direct_unmap_page(dev, sgl->dma_address, > + sg_dma_len(sgl), dir, attrs); > +} > + > /* > * Unmaps segments, except for ones marked as pci_p2pdma which do not > * require any further action as they contain a bus address. > @@ -449,12 +461,20 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl, > int i; > > for_each_sg(sgl, sg, nents, i) { > - if (sg_dma_is_bus_address(sg)) > - sg_dma_unmark_bus_address(sg); > - else > - dma_direct_unmap_page(dev, sg->dma_address, > - sg_dma_len(sg), dir, attrs); > + /* > + * Skip the 0th SGL entry in case this SGL consists of > + * throttled swiotlb mappings. In such a case, any other > + * entries should be unmapped first since unmapping the > + * 0th entry will release the throttle semaphore. > + */ > + if (!i) > + continue; > + dma_direct_unmap_sgl_entry(dev, sg, dir, attrs); > } > + > + /* Now do the 0th SGL entry */ > + if (nents) I wonder if nents can ever be zero here, but it's nowhere enforced and dma_map_sg_attrs() is exported, so I agree, let's play it safe. > + dma_direct_unmap_sgl_entry(dev, sgl, dir, attrs); > } > #endif > > @@ -492,6 +512,11 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents, > ret = -EIO; > goto out_unmap; > } > + > + /* Allow only the 0th SGL entry to block */ > + if (!i) Are you sure? I think the modified value of attrs is first used in the next loop iteration, so the conditional should be removed, or else both segment index 0 and 1 will keep the flag. Petr T > + attrs &= ~DMA_ATTR_MAY_BLOCK; > + > sg_dma_len(sg) = sg->length; > } >