Re: [PATCH 2/7] pci: memory access API and IOMMU support

On Thu, Sep 2, 2010 at 9:49 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> On Thu, Sep 02, 2010 at 11:40:58AM +0300, Eduard - Gabriel Munteanu wrote:
>> On Thu, Sep 02, 2010 at 08:28:26AM +0300, Michael S. Tsirkin wrote:
>> > On Sat, Aug 28, 2010 at 05:54:53PM +0300, Eduard - Gabriel Munteanu wrote:
>> > > PCI devices should access memory through pci_memory_*() instead of
>> > > cpu_physical_memory_*(). This also provides support for translation and
>> > > access checking in case an IOMMU is emulated.
>> > >
>> > > Memory maps are treated as remote IOTLBs (that is, translation caches
>> > > belonging to the IOMMU-aware device itself). Clients (devices) must
>> > > provide callbacks for map invalidation in case these maps are
>> > > persistent beyond the current I/O context, e.g. AIO DMA transfers.
>> > >
>> > > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@xxxxxxxxxxx>
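
(To make the discussion below concrete, here is a rough guess at the shape of
the proposed bus-level API; the signatures are illustrative only, not taken
from the patch.)

    /* Hypothetical shape of the API described above: the device handle is
     * passed in so an emulated IOMMU can translate and access-check the
     * address.  Exact signatures are my guess, not necessarily what the
     * patch implements. */
    void pci_memory_read(PCIDevice *dev, pcibus_t addr, void *buf, int len);
    void pci_memory_write(PCIDevice *dev, pcibus_t addr,
                          const void *buf, int len);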
>> >
>> >
>> > I am concerned about adding more pointer chasing on the data path.
>> > Could we have
>> > 1. an iommu pointer in a device, inherited by secondary buses
>> >    when they are created and by devices from buses when they are attached.
>> > 2. a translation pointer in the iommu instead of the bus
>>
>> The first solution I proposed was based on qdev, that is, each
>> DeviceState had an 'iommu' field. Translation would be done by
>> recursively looking in the parent bus/devs for an IOMMU.
>>
>> But Anthony said we're better off with bus-specific APIs, mostly because
>> (IIRC) there may be different types of addresses and it might be
>> difficult to abstract those properly.
>
> Well, we ended up casting
> away types to make the pci callbacks fit in the ide structure,
> and silently assuming that all addresses are in fact 64 bit.
> So maybe it's hard to abstract addresses properly, but
> it appears we'll have to, to avoid even worse problems.
>
>> I suppose I could revisit the idea by integrating the IOMMU in a
>> PCIDevice as opposed to a DeviceState.
>>
>> Anthony, Paul, any thoughts on this?
>
> Just to clarify: this is an optimization idea:
> instead of a bus walk on each access, do the walk
> when the device is attached to the bus, and copy the iommu
> from the root to the device itself.
>
> This will also make it possible to create a
> DMADeviceState structure which would have this iommu field,
> and we'd use this structure instead of the void pointers all over.
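
(My reading of this idea, as a rough sketch; DMADeviceState and the field
names below are made up to illustrate the shape, not actual code.)

    /* Resolve the IOMMU once, when the device is attached to the bus,
     * instead of walking the bus hierarchy on every access. */
    typedef struct DMADeviceState {
        DeviceState qdev;
        struct iommu *iommu;    /* copied from the root bus at attach time */
    } DMADeviceState;

    struct iommu {
        /* the translation callback lives in the iommu, not in the bus */
        int (*translate)(struct iommu *iommu, DMADeviceState *dev,
                         uint64_t addr, uint64_t *paddr, unsigned *perms);
        void *opaque;
    };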
>
>
>> > 3. pci_memory_XX functions inline, doing fast path for non-iommu case:
>> >
>> >     if (__builtin_expect(!dev->iommu, 1))
>> >             return cpu_memory_rw(...);
>>
>> But isn't this some sort of 'if (likely(!dev->iommu))' from the Linux
>> kernel? If so, it puts the IOMMU-enabled case at a disadvantage.
>
> IOMMU has a ton of indirections anyway.
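
(Something along these lines, I assume; the wrapper below is only a sketch,
and pci_memory_rw()/pci_memory_rw_iommu() are illustrative names rather than
the patch's actual functions.)

    /* Fast-path sketch: an inline wrapper that falls through to plain
     * physical memory access when no IOMMU is present. */
    static inline void pci_memory_rw(PCIDevice *dev, pcibus_t addr,
                                     uint8_t *buf, int len, int is_write)
    {
        if (__builtin_expect(!dev->iommu, 1)) {
            /* common case: no IOMMU in the path, no extra indirection */
            cpu_physical_memory_rw(addr, buf, len, is_write);
            return;
        }
        /* slow path: translate through the IOMMU, then access */
        pci_memory_rw_iommu(dev, addr, buf, len, is_write);
    }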
>
>> I suppose most emulated systems would have at least some theoretical
>> reasons to enable the IOMMU, e.g. as a GART replacement (say for 32-bit
>> PCI devices) or for userspace drivers.
>> So there are reasons to enable
>> the IOMMU even when you don't have a real host IOMMU and you're not
>> using nested guests.
>
> The time when most people enable the iommu for all devices, in both real and
> virtualized systems, appears distant; one of the reasons is that it has a lot
> of overhead. Let's start by not adding overhead for existing users, makes sense?

I think the goal architecture (not just for the IOMMU, but in general) is one
with zero-copy DMA. This means we have stage one, where the addresses
are translated to host pointers, and stage two, where the read/write
operation happens. The devices need to be converted to this model.
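
To illustrate the two stages, a rough sketch; dma_map()/dma_unmap() are
placeholder names, not an existing API:

    /* Example: a disk read DMA under the two-stage model. */
    static void disk_dma_read(PCIDevice *dev, uint64_t bus_addr,
                              int fd, off_t offset, size_t len)
    {
        /* stage one: translate the bus address to a host pointer;
         * this is the only step where an emulated IOMMU is involved */
        void *host = dma_map(dev, bus_addr, len, 1 /* write to memory */);
        if (!host) {
            return;    /* translation failed or access denied */
        }

        /* stage two: the backend reads straight into guest memory,
         * no intermediate bounce buffer */
        pread(fd, host, len, offset);
        dma_unmap(dev, host, len);
    }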

Now, let's consider the IOMMU in this zero-copy architecture. It's one
stage of address translation; for the access operation it does not
matter. We can add translation caching at the device level (or even at any
intermediate level), but that needs a cache invalidation callback
system, as discussed earlier. This can be implemented later; we need
the zero-copy stuff first.
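
For reference, the invalidation hook could look roughly like this; the names
below are mine, not the patch's interface:

    /* A persistent map (e.g. for AIO DMA) registers a callback so the
     * IOMMU can tear down the cached translation if the guest changes
     * its mappings while the I/O is still in flight. */
    typedef void (*dma_invalidate_fn)(void *opaque);

    void *dma_map_persistent(PCIDevice *dev, uint64_t bus_addr, size_t len,
                             dma_invalidate_fn invalidate, void *opaque);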

Given this overall picture, I think eliminating some pointer
dereference overhead in the non-zero-copy architecture is a very
premature optimization, and it may even steer the architecture in the
wrong direction.

If the performance degradation at this point is not acceptable, we
could also postpone merging the IOMMU until the zero-copy conversion has
happened, or make the IOMMU a compile-time option. But it would be nice to
back the decision with performance figures.