On Wednesday 29 June 2005 02:15, Bryan J. Smith wrote:
> On Wed, 2005-06-29 at 00:26 -0400, Peter Arremann wrote:
> > RedHat and others don't want to have to support two separate kernels - so
> > they limit IO to the lowest 4GB no matter if you're running an Opteron
> > or EM64T.
>
> On Wed, 2005-06-29 at 00:01 -0500, Bryan J. Smith wrote:
> > ? I was unaware this is how they handled Opteron. I thought Red Hat
> > _dynamically_ handled EM64T separately in their x86-64 kernels, and that
> > was a major performance hit.
>
> Looking again at the release notes ...
>
> http://www.centos.org/docs/3/release-notes/as-amd64/RELEASE-NOTES-U2-x86_64-en.html#id3938207

Yes, I used that URL before as well :-) I interpreted it differently
though, as being implemented for both... In pci-gart.c, the init function
iommu_setup shows where the initialization is done... the only place
where it's called is in setup.c, inside a define on CONFIG_GART_IOMMU...
That's set to yes, so that code is compiled in. In pci-gart.c you can see
that the string "soft" would have to be passed to that function for the
Intel software MMU... I didn't have time to track down where it goes
from there - my vacation is over and I need to get back to work...

> From the looks of it, it's not just whether memory mapped I/O is above
> 4GiB, but _any_ direct memory access (DMA) by a device where either the
> source or destination is above 4GiB. I.e., the memory mapped I/O might
> be below 4GiB, but the device might be executing a DMA transfer to user
> memory above 4GiB.
>
> That's where the "Software IOTLB" comes in, _only_enabled_ on EM64T.
>
> If I remember back to the March 2004 onward threads on the LKML, that's
> how they dealt with it -- using pre-allocated kernel bounce buffers
> below 4GiB. A Linux/x86-64 kernel _always_ uses an I/O MMU -- it is
> just software for EM64T if either the source or destination address of a
> DMA transfer is above 4GiB.
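
To make the rule in the quoted text concrete, here is a toy sketch in
Python of the decision being described: a transfer that lies entirely
below 4GiB goes to the device directly, anything reaching above it is
redirected to a pre-allocated slot in low memory. This is only an
illustration, not the kernel code. The names needs_bounce, BouncePool,
map_single and unmap_single, the pool base address and the slot sizes
are all made up for the example, and the actual copy of the data into
and out of the bounce slot is left out.

```python
from dataclasses import dataclass, field

FOUR_GIB = 1 << 32  # limit of a 32-bit DMA address


def needs_bounce(phys_addr, length, dma_limit=FOUR_GIB):
    """Return True if any byte of the buffer lies at or above the limit."""
    return phys_addr + length > dma_limit


@dataclass
class BouncePool:
    """Toy pool of pre-allocated slots below 4GiB (all values hypothetical)."""
    base: int = 0x0100_0000   # assumed low-memory address of the pool
    slot_size: int = 4096     # assumed size of one bounce slot
    slots: int = 8            # assumed number of slots
    in_use: dict = field(default_factory=dict)  # slot index -> original address

    def map_single(self, phys_addr, length):
        """Return (address the device should use, whether it was bounced)."""
        if not needs_bounce(phys_addr, length):
            return phys_addr, False          # device can reach it directly
        if length > self.slot_size:
            raise ValueError("buffer larger than one bounce slot")
        for i in range(self.slots):
            if i not in self.in_use:
                self.in_use[i] = phys_addr   # remember where the data belongs
                return self.base + i * self.slot_size, True
        raise RuntimeError("bounce pool exhausted")

    def unmap_single(self, bus_addr):
        """Release the slot behind bus_addr; no-op for direct mappings."""
        end = self.base + self.slots * self.slot_size
        if self.base <= bus_addr < end:
            self.in_use.pop((bus_addr - self.base) // self.slot_size, None)


if __name__ == "__main__":
    pool = BouncePool()
    for addr in (0x2000, 5 * (1 << 30)):     # one buffer below 4GiB, one above
        bus, bounced = pool.map_single(addr, 512)
        print(hex(addr), "->", hex(bus), "bounced" if bounced else "direct")
        pool.unmap_single(bus)
```

With a hardware I/O MMU the remapping is done by the chipset, so neither
the extra slot nor the copy is needed.
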
>
> I don't think it really matters where the memory mapped I/O is itself.
> Although it obviously is advantageous if it is setup under 4GiB on EM64T
> -- because it would only need the "bounce buffers" when a DMA transfer
> is to user memory above 4GiB, instead of _always_ if the memory mapped
> I/O was above 4GiB.

Yes - the question was about bounce buffers... You need them for DMA
access, like I said before - and if Intel had implemented an I/O MMU, you
wouldn't need them there either.

Peter.