[adding Alex as he has been interested in this in the past]

On Mon, Mar 20, 2023 at 01:28:15PM +0100, Petr Tesarik wrote:
> Second, on the Raspberry Pi 4, swiotlb is used by dma-buf for pages
> moved from the rendering GPU (v3d driver), which can access all
> memory, to the display output (vc4 driver), which is connected to a
> bus with an address limit of 1 GiB and no IOMMU. These buffers can
> be large (several megabytes) and cannot be handled by SWIOTLB,
> because they exceed maximum segment size of 256 KiB. Such mapping
> failures can be easily reproduced on a Raspberry Pi4: Starting
> GNOME remote desktop results in a flood of kernel messages like
> these:

Shouldn't we make sure dma-buf allocates the buffers for the most
restricted devices, and more importantly does something like a dma
coherent allocation instead of a dynamic mapping of random memory?

While a larger swiotlb works around this I don't think this fixes the
root cause.

> 1. The value is limited to ULONG_MAX, which is too little both for
>    physical addresses (e.g. x86 PAE or 32-bit ARM LPAE) and DMA
>    addresses (e.g. Xen guests on 32-bit ARM).
>
> 2. Since buffers are currently allocated with page granularity, a
>    PFN can be used instead. However, some values are reserved by
>    the maple tree implementation. Liam suggests to use
>    xa_mk_value() in that case, but that reduces the usable range by
>    half. Luckily, 31 bits are still enough to hold a PFN on all
>    32-bit platforms.
>
> 3. Software IO TLB is used from interrupt context. The maple tree
>    implementation is not IRQ-safe (MT_FLAGS_LOCK_IRQ does nothing
>    AFAICS). Instead, I use an external lock, spin_lock_irqsave() and
>    spin_unlock_irqrestore().
>
> Note that bounce buffers are never allocated dynamically if the
> software IO TLB is in fact a DMA restricted pool, which is intended
> to be stay in its designated location in physical memory.

I'm a little worried about all that because it causes quite a bit of
overhead even for callers that don't end up going into the dynamic
range or do not use swiotlb at all. I don't really have a good answer
here except for the usual advice to avoid bounce buffering whenever
you can, which might not always be easy to do.

> +	gfp = (attrs & DMA_ATTR_MAY_SLEEP) ? GFP_KERNEL : GFP_NOWAIT;
> +	slot = kmalloc(sizeof(*slot), gfp | __GFP_NOWARN);
> +	if (!slot)
> +		goto err;
> +
> +	slot->orig_addr = orig_addr;
> +	slot->alloc_size = alloc_size;
> +	slot->page = dma_direct_alloc_pages(dev, PAGE_ALIGN(alloc_size),
> +					    &slot->dma_addr, dir,
> +					    gfp | __GFP_NOWARN);
> +	if (!slot->page)
> +		goto err_free_slot;

Without GFP_NOIO allocations this will deadlock eventually.
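
To illustrate the GFP_NOIO remark, here is a small userspace model of the flag choice in the hunk above. It is only a sketch: the MODEL_* constants and the helper names are made up for this example and are not the kernel's real gfp values or API, and the model ignores kswapd wakeups. The assumption is that the concern is a mapping made on the block I/O path, where an allocation that is allowed to start I/O during reclaim can end up waiting on the very I/O it is part of.

```c
#include <stdbool.h>

/* Stand-in flag bits for illustration only; NOT the kernel's real values. */
#define MODEL_DIRECT_RECLAIM 0x1u /* allocation may sleep and reclaim */
#define MODEL_IO             0x2u /* reclaim may start block I/O */
#define MODEL_FS             0x4u /* reclaim may call into filesystems */

/* Simplified models of GFP_NOWAIT, GFP_NOIO and GFP_KERNEL. */
#define MODEL_GFP_NOWAIT 0x0u
#define MODEL_GFP_NOIO   (MODEL_DIRECT_RECLAIM)
#define MODEL_GFP_KERNEL (MODEL_DIRECT_RECLAIM | MODEL_IO | MODEL_FS)

/* Flag choice as in the quoted hunk: sleeping callers get GFP_KERNEL. */
static unsigned int gfp_as_posted(bool may_sleep)
{
	return may_sleep ? MODEL_GFP_KERNEL : MODEL_GFP_NOWAIT;
}

/* Hypothetical variant: sleeping callers may reclaim but never start I/O. */
static unsigned int gfp_noio_variant(bool may_sleep)
{
	return may_sleep ? MODEL_GFP_NOIO : MODEL_GFP_NOWAIT;
}

/* True if an allocation with these flags is allowed to start I/O. */
static bool may_start_io(unsigned int gfp)
{
	return (gfp & MODEL_IO) != 0;
}
```

In this model, may_start_io(gfp_as_posted(true)) is true while may_start_io(gfp_noio_variant(true)) is false, which is the difference the comment is about. Whether the real fix is to pass GFP_NOIO directly or to handle it some other way is not something this sketch tries to settle.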