[OT] Memory Models and Multi/Virtual-Cores -- Software IOTLB

loony at loonybin.org (Peter Arremann) · Wed Jun 29 14:20:04 2005

On Wednesday 29 June 2005 02:15, Bryan J. Smith wrote:
> On Wed, 2005-06-29 at 00:26 -0400, Peter Arremann wrote:
> > RedHat and others don't want to have to support two separate kernels - so
> > they limit IO to the lowest 4GB no matter if you're running an Opteron
> > or EM64T.
>
> On Wed, 2005-06-29 at 00:01 -0500, Bryan J. Smith wrote:
> > ?  I was unaware this is how they handled Opteron.  I thought Red Hat
> > _dynamically_ handled EM64T separately in their x86-64 kernels, and that
> > was a major performance hit.
>
> Looking again at the release notes ...
>
> http://www.centos.org/docs/3/release-notes/as-amd64/RELEASE-NOTES-U2-x86_64-en.html#id3938207
Yes, I used that URL before as well  :-) I interpreted it differently though,
as it being implemented for both... pci-gart.c has an init function,
iommu_setup, where the initialization is done... the only place where it's
called is in setup.c, where it is guarded by a CONFIG_GART_IOMMU define...
That's set to yes, so that code is compiled in. In pci-gart.c you can see
that the string "soft" would have to be passed to that function for the Intel
software IOMMU... I didn't have time to track down where it goes from there -
my vacation is over and I need to get back to work...
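
For illustration only, here is a minimal sketch of how a setup function could
pick "soft" out of a comma-separated option string. This is not the code from
pci-gart.c: the struct, the function name parse_iommu_opts and the "off" token
are assumptions made up for the example.

```c
#include <string.h>

/* Hypothetical flags; the real kernel keeps its own variables. */
struct iommu_opts {
    int use_soft;   /* 1 if the "soft" token was present */
    int disabled;   /* 1 if the "off" token was present (assumed token) */
};

/* Walk a comma-separated option string such as "force,soft". */
static void parse_iommu_opts(const char *opt, struct iommu_opts *out)
{
    out->use_soft = 0;
    out->disabled = 0;
    while (opt && *opt) {
        size_t n = strcspn(opt, ",");   /* length of the current token */
        if (n == 4 && strncmp(opt, "soft", 4) == 0)
            out->use_soft = 1;
        else if (n == 3 && strncmp(opt, "off", 3) == 0)
            out->disabled = 1;
        opt += n;
        if (*opt == ',')
            opt++;                      /* skip the separator */
    }
}
```

Unknown tokens are ignored, so a string like "force,soft" still sets use_soft.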

> From the looks of it, it's not just whether memory mapped I/O is above
> 4GiB, but _any_ direct memory access (DMA) by a device where either the
> source or destination is above 4GiB.  I.e., the memory mapped I/O might
> be below 4GiB, but the device might be executing a DMA transfer to user
> memory above 4GiB.
>
> That's where the "Software IOTLB" comes in, _only_enabled_ on EM64T.
>
> If I remember back to the March 2004 onward threads on the LKML, that's
> how they dealt with it -- using pre-allocated kernel bounced buffers
> below 4GiB.  A Linux/x86-64 kernel _always_ uses an I/O MMU -- it is
> just software for EM64T if either the source or destination address of a
> DMA transfer is above 4GiB.
>
> I don't think it really matters where the memory mapped I/O is itself.
> Although it obviously is advantageous if it is setup under 4GiB on EM64T
> -- because it would only need the "bounce buffers" when a DMA transfer
> is to user memory above 4GiB, instead of _always_ if the memory mapped
> I/O was above 4GiB.

Yes - the question was about bounce buffers... You need them for DMA access,
like I said before - and if Intel had implemented an IO MMU, you wouldn't
need them there either.
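
A rough user-space sketch of the bounce buffer idea, to show where the extra
copies come from. It is not the kernel's swiotlb code: the struct, the function
names and the "physical" addresses passed in as plain integers are all
assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DMA_LIMIT_4G 0x100000000ULL  /* first address a 32-bit device cannot reach */

enum dma_dir { DMA_TO_DEV, DMA_FROM_DEV };  /* hypothetical direction flag */

/* Hypothetical record of one mapping; not a kernel structure. */
struct dma_map {
    uint64_t dev_addr;  /* address handed to the device */
    void *orig;         /* caller's buffer */
    void *bounce;       /* low bounce buffer, or NULL if none was needed */
    size_t len;
};

/* Nonzero if any byte of [phys, phys + len) lies at or above 4 GiB. */
static int needs_bounce(uint64_t phys, size_t len)
{
    if (len == 0)
        return 0;
    return phys >= DMA_LIMIT_4G || len > DMA_LIMIT_4G - phys;
}

/* Map buf for the device, bouncing through low memory when required. */
static void map_sketch(struct dma_map *m, void *buf, uint64_t buf_phys,
                       size_t len, void *low, uint64_t low_phys,
                       enum dma_dir dir)
{
    m->orig = buf;
    m->len = len;
    if (!needs_bounce(buf_phys, len)) {
        m->bounce = NULL;
        m->dev_addr = buf_phys;   /* device can address the buffer directly */
        return;
    }
    m->bounce = low;
    m->dev_addr = low_phys;       /* device only ever sees the low address */
    if (dir == DMA_TO_DEV)
        memcpy(low, buf, len);    /* extra copy before the transfer */
}

/* Finish the transfer; copy device-written data back to the caller. */
static void unmap_sketch(struct dma_map *m, enum dma_dir dir)
{
    if (m->bounce && dir == DMA_FROM_DEV)
        memcpy(m->orig, m->bounce, m->len);  /* extra copy after the transfer */
}
```

With a hardware IO MMU the memcpy calls go away, because the device address
can simply be remapped to the high buffer.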

Peter.