From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Sent: Thursday, July 6, 2023 1:07 AM
> 
> On Thu, Jul 06, 2023 at 03:50:55AM +0000, Michael Kelley (LINUX) wrote:
> > From: Petr Tesarik <petrtesarik@xxxxxxxxxxxxxxx> Sent: Tuesday, June 27, 2023 2:54 AM
> > > 
> > > Try to allocate a transient memory pool if no suitable slots can be
> > > found, except when allocating from a restricted pool. The transient
> > > pool is just big enough for this one bounce buffer. It is inserted
> > > into a per-device list of transient memory pools, and it is freed
> > > again when the bounce buffer is unmapped.
> > > 
> > > Transient memory pools are kept in an RCU list. A memory barrier is
> > > required after adding a new entry, because any address within a
> > > transient buffer must be immediately recognized as belonging to the
> > > SWIOTLB, even if it is passed to another CPU.
> > > 
> > > Deletion does not require any synchronization beyond RCU ordering
> > > guarantees. After a buffer is unmapped, its physical addresses may
> > > no longer be passed to the DMA API, so the memory range of the
> > > corresponding stale entry in the RCU list never matches. If the
> > > memory range gets allocated again, it happens only after an RCU
> > > quiescent state.
> > > 
> > > Since bounce buffers can now be allocated from different pools, add
> > > a parameter to swiotlb_alloc_pool() to let the caller know which
> > > memory pool is used. Add swiotlb_find_pool() to find the memory pool
> > > corresponding to an address. This function is now also used by
> > > is_swiotlb_buffer(), because a simple boundary check is no longer
> > > sufficient.
> > > 
> > > The logic in swiotlb_alloc_tlb() is taken from
> > > __dma_direct_alloc_pages(), simplified and enhanced to use coherent
> > > memory pools if needed.
> > > 
> > > Note that this is not the most efficient way to provide a bounce
> > > buffer, but when a DMA buffer can't be mapped, something may (and
> > > will) actually break. At that point it is better to make an
> > > allocation, even if it may be an expensive operation.
> > 
> > I continue to think about swiotlb memory management from the
> > standpoint of CoCo VMs that may be quite large with high network and
> > storage loads. These VMs are often running mission-critical workloads
> > that can't tolerate a bounce buffer allocation failure. To prevent
> > such failures, the swiotlb memory size must be overly large, which
> > wastes memory.
> 
> If "mission critical workloads" are in a vm that allows overcommit and
> gives you no control over other vms in that same system, then you have
> worse problems, sorry.
> 
> Just don't do that.

No, the cases I'm concerned about don't involve memory overcommit. CoCo
VMs must use swiotlb bounce buffers to do DMA I/O. Current swiotlb code
in the Linux guest allocates a configurable, but fixed, amount of guest
memory at boot time for this purpose. But it's hard to know how much
swiotlb bounce buffer memory will be needed to handle peak I/O loads.
This patch set does dynamic allocation of swiotlb bounce buffer memory,
which can help avoid needing to configure an overly large fixed size at
boot.

Michael
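
---

To make the RCU-list mechanism in the quoted changelog concrete, here is
a minimal, self-contained sketch of the pattern it describes: publishing
a transient pool on a lock-protected RCU list with a full memory barrier
so that another CPU handed the bounce buffer address immediately finds
the entry, and an RCU-protected lookup by physical address. All names
below (transient_pool, dev_pools, pool_add_transient, and so on) are
illustrative stand-ins, not the actual swiotlb symbols from the patch.

/* Sketch of the transient-pool RCU list pattern; hypothetical names. */
#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct transient_pool {
	struct list_head node;	/* linked into the per-device list */
	struct rcu_head rcu;	/* for deferred freeing */
	phys_addr_t start;	/* first byte of the pool */
	phys_addr_t end;	/* one past the last byte */
};

static LIST_HEAD(dev_pools);		/* stand-in for the per-device list */
static DEFINE_SPINLOCK(dev_pools_lock);	/* serializes writers only */

/* Publish a new transient pool so any CPU can recognize its addresses. */
static void pool_add_transient(struct transient_pool *pool)
{
	spin_lock(&dev_pools_lock);
	list_add_rcu(&pool->node, &dev_pools);
	spin_unlock(&dev_pools_lock);
	/*
	 * Full barrier, per the changelog: the insertion must be visible
	 * before the bounce buffer address is passed to another CPU, so
	 * a lookup there finds the new entry.
	 */
	smp_mb();
}

/*
 * Lookup by physical address, as swiotlb_find_pool() is described to
 * do. The caller must guarantee the pool stays live (i.e. the bounce
 * buffer is still mapped), which is why no reference is taken here.
 */
static struct transient_pool *pool_find_transient(phys_addr_t paddr)
{
	struct transient_pool *pool, *found = NULL;

	rcu_read_lock();
	list_for_each_entry_rcu(pool, &dev_pools, node) {
		if (paddr >= pool->start && paddr < pool->end) {
			found = pool;
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/*
 * Unpublish on unmap. No further synchronization is needed: stale
 * readers see a range that can never match again, and the memory is
 * reused only after an RCU grace period (kfree_rcu).
 */
static void pool_del_transient(struct transient_pool *pool)
{
	spin_lock(&dev_pools_lock);
	list_del_rcu(&pool->node);
	spin_unlock(&dev_pools_lock);
	kfree_rcu(pool, rcu);
}

The deletion side mirrors the changelog's argument: once the buffer is
unmapped, its physical addresses may no longer be handed to the DMA API,
so a reader racing with list_del_rcu() can only ever fail the range
check, and RCU alone is sufficient ordering.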