Re: [PATCH v3 1/1] Documentation/core-api: Add swiotlb documentation

On Wed, 1 May 2024 06:49:10 +0200
Christoph Hellwig <hch@xxxxxx> wrote:

> On Tue, Apr 30, 2024 at 01:24:13PM +0200, Petr Tesařík wrote:
> > > +swiotlb was originally created to handle DMA for devices with addressing
> > > +limitations. As physical memory sizes grew beyond 4 GiB, some devices could
> > > +only provide 32-bit DMA addresses. By allocating bounce buffer memory below
> > > +the 4 GiB line, these devices with addressing limitations could still work and
> > > +do DMA.  
> > 
> > IIRC the origins are even older and bounce buffers were used to
> > overcome the design flaws inherited all the way from the original IBM
> > PC.  
> 
> [correct, but for swiotlb largely irrelevant PC addressing bits snipped]
> 
> swiotlb was added with the merge of the ia64 port to address 32-bit
> addressing limitations.  The 1MB addressing limitations of the PC did
> and still do of course exist, but weren't dealt with in any coherent
> fashion, and still aren't.  Swiotlb isn't related to them.

Thanks for correcting me. Oh, and this is probably why some drivers did
their own bounce buffering. I was wrong to assume that swiotlb was
meant to clean up the existing mess...

> > > +data to/from the original target memory buffer. The CPU copying bridges between
> > > +the unencrypted and the encrypted memory. This use of bounce buffers allows
> > > +existing device drivers to "just work" in a CoCo VM, with no modifications
> > > +needed to handle the memory encryption complexity.  
> > 
> > This part might be misleading. It sounds as if SWIOTLB would not be
> > needed if drivers were smarter. But IIUC that's not the case. SWIOTLB
> > is used for streaming DMA, where device drivers have little control
> > over the physical placement of a DMA buffer. For example, when a
> > process allocates some memory, the kernel cannot know that this memory
> > will be later passed to a write(2) syscall to do direct I/O of a
> > properly aligned buffer that can go all the way down to the NVMe driver
> > with zero copy.  
> 
> I think the statement in the text is fine and easy to understand.  CoCo
> drivers could instead always map the memory unencrypted (which would have
> no so nice security and performance properties) or use fixed ringbuffers
> in shared unencrypted memory (which would require a different driver
> architecture).
> 
> > > +block. Hence the default memory pool for swiotlb allocations must be
> > > +pre-allocated at boot time (but see Dynamic swiotlb below). Because swiotlb
> > > +allocations must be physically contiguous, the entire default memory pool is
> > > +allocated as a single contiguous block.  
> > 
> > Allocations must be contiguous in target device's DMA address space. In
> > practice this is achieved by being contiguous in CPU physical address
> > space (aka "physically contiguous"), but there might be subtle
> > differences, e.g. in a virtualized environment.
> > 
> > Now that I'm thinking about it, leave the paragraph as is, and I'll
> > update it if I write the code for it.  
> 
> Heh.  The only thing making CPU non-contiguous address space contiguous
> for a device is an iommu.  And when we have that we only use swiotlb
> for unaligned iommu pages, so I'm not sure how we'd ever get there.

Yes, there's no way to make CPU non-contiguous addresses contiguous for
a device (except with an IOMMU), but there are some real-world bus
bridges that make a CPU-contiguous address range non-contiguous for a
target device, most often by limiting the address width and wrapping
around at the corresponding boundary.

This is moot anyway, because I suggest leaving the paragraph as is.

Petr T




