Re: [PATCH 4/7] ide: use the PCI memory access interface

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 02, 2010 at 12:58:13PM +0300, Michael S. Tsirkin wrote:
> On Thu, Sep 02, 2010 at 12:12:00PM +0300, Eduard - Gabriel Munteanu wrote:
> > On Thu, Sep 02, 2010 at 08:19:11AM +0300, Michael S. Tsirkin wrote:
> > > On Sat, Aug 28, 2010 at 05:54:55PM +0300, Eduard - Gabriel Munteanu wrote:
> 
> I don't insist on this solution, but what other way do you propose to
> avoid the overhead for everyone not using an iommu?
> I'm all for a solution that would help iommu as well,
> but one wasn't yet proposed.
> 

Hm, we could get even better performance by simply making the IOMMU a
compile-time option. It also avoids problems in case some device hasn't
been converted yet, and involves little to no tradeoffs. What do you
think?

AFAICT, there are few uses for the IOMMU besides development and
avantgarde stuff, as you note. So distributions can continue supplying
prebuilt QEMU/KVM packages compiled with the IOMMU turned off for the
time being. The only practical (commercial) use right now would be in
the case of private virtual servers, which could be divided further into
nested guests (though real IOMMU hardware isn't widespread yet).

Blue Swirl, in the light of this, do you agree on making it a
compile-time option?

> > > >  static inline IDEState *idebus_active_if(IDEBus *bus)
> > > >  {
> > > >      return bus->ifs + bus->unit;
> > > > diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> > > > index bd1c73e..962ae13 100644
> > > > --- a/hw/ide/macio.c
> > > > +++ b/hw/ide/macio.c
> > > > @@ -79,7 +79,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
> > > >  
> > > >      s->io_buffer_size = io->len;
> > > >  
> > > > -    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > +    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > >      qemu_sglist_add(&s->sg, io->addr, io->len);
> > > >      io->addr += io->len;
> > > >      io->len = 0;
> > > > @@ -141,7 +141,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
> > > >      s->io_buffer_index = 0;
> > > >      s->io_buffer_size = io->len;
> > > >  
> > > > -    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
> > > > +    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL, NULL, NULL);
> > > >      qemu_sglist_add(&s->sg, io->addr, io->len);
> > > >      io->addr += io->len;
> > > >      io->len = 0;
> > > > diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> > > > index 4d95cc5..5879044 100644
> > > > --- a/hw/ide/pci.c
> > > > +++ b/hw/ide/pci.c
> > > > @@ -183,4 +183,11 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table)
> > > >              continue;
> > > >          ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]);
> > > >      }
> > > > +
> > > > +    for (i = 0; i < 2; i++) {
> > > > +        d->bmdma[i].rw = (void *) pci_memory_rw;
> > > > +        d->bmdma[i].map = (void *) pci_memory_map;
> > > > +        d->bmdma[i].unmap = (void *) pci_memory_unmap;
> > > > +        d->bmdma[i].opaque = dev;
> > > > +    }
> > > >  }
> > > 
> > > These casts show something is wrong with the API, IMO.
> > > 
> > 
> > Hm, here's an oversight on my part: I think I should provide explicit
> > bmdma hooks, since pcibus_t is a uint64_t and target_phys_addr_t is a
> > uint{32,64}_t depending on the guest machine, so it might be buggy on
> > 32-bit wrt calling conventions. But that introduces yet another
> > non-inlined function call :-(. That would drop the (void *) cast,
> > though.
> > 
> > 
> > 	Eduard
> 
> So we get away with it without casts but only because C compiler
> will let us silently convert the types, possibly discarding
> data in the process. Or we'll add a check that will try and detect
> this, but there's no good way to report a DMA error to user.
> IOW, if our code only works because target fits in pcibus, what good
> is the abstraction and using distinct types?
> 
> This is why I think we need a generic DMA APIs using dma addresses.

The API was made so that it doesn't report errors. That's because the
PCI bus doesn't provide any possibility of doing so (real devices can't
retry transfers in case an I/O page fault occurs).

In my previous generic IOMMU layer implementation pci_memory_*()
returned non-zero on failure, but I decided to drop it when switching to
a PCI-only (rather a PCI-specific) approach.

In case target_phys_addr_t no longer fits in pcibus_t by a simple
implicit conversion, those explicit bmdma hooks I was going to add will
do the necessary conversions.

The idea of using distinct types is two-fold: let the programmer know
not to rely on them being the same thing, and let the compiler prevent
him from shooting himself in the foot (like I did). Even if there is a
dma_addr_t, some piece of code still needs to provide glue and
conversion between the DMA code and bus-specific code.


	Eduard

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux