Re: [PATCH RFC] dma_rw.h (was Re: [PATCH 0/7] AMD IOMMU emulation patchset v4)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 16, 2010 at 10:06:16AM +0300, Eduard - Gabriel Munteanu wrote:
> On Mon, Sep 13, 2010 at 10:01:20PM +0200, Michael S. Tsirkin wrote:
> > So I think the following will give the idea of what an API
> > might look like that will let us avoid the scary hacks in
> > e.g. the ide layer and other generic layers that need to do DMA,
> > without either binding us to pci, adding more complexity with
> > callbacks, or losing type safety with casts and void*.
> > 
> > Basically we have DMADevice that we can use container_of on
> > to get a PCIDevice from, and DMAMmu that will get instanciated
> > in a specific MMU.
> > 
> > This is not complete code - just a header - I might complete
> > this later if/when there's interest or hopefully someone interested
> > in iommu emulation will.
> 
> Hi,
> 
> I personally like this approach better. It also seems to make poisoning
> cpu_physical_memory_*() easier if we convert every device to this API.
> We could then ban cpu_physical_memory_*(), perhaps by requiring a
> #define and #ifdef-ing those declarations.
> 
> > Notes:
> > the IOMMU_PERM_RW code seem unused, so I replaced
> > this with plain is_write. Is it ever useful?
> 
> The original idea made provisions for stuff like full R/W memory maps.
> In that case cpu_physical_memory_map() would call the translation /
> checking function with perms == IOMMU_PERM_RW. That's not there yet so
> it can be removed at the moment, especially since it only affects these
> helpers.
> 
> Also, I'm not sure if there are other sorts of accesses besides reads
> and writes we want to check or translate.
> 
> > It seems that invalidate callback should be able to
> > get away with just a device, so I switched to that
> > from a void pointer for type safety.
> > Seems enough for the users I saw.
> 
> I think this makes matters too complicated. Normally, a single DMADevice
> should be embedded within a <bus>Device,

No, DMADevice is a device that does DMA.
So e.g. a PCI device would embed one.
Remember, traslations are per device, right?
DMAMmu is part of the iommu object.

> so doing this makes it really
> hard to invalidate a specific map when there are more of them. It forces
> device code to act as a bus, provide fake 'DMADevice's for each map and
> dispatch translation to the real DMATranslateFunc. I see no other way.
> 
> If you really want more type-safety (although I think this is a case of
> a true opaque identifying something only device code understands), I
> have another proposal: have a DMAMap embedded in the opaque. Example
> from dma-helpers.c:
> 
> typedef struct {
> 	DMADevice *owner;
> 	[...]
> } DMAMap;
> 
> typedef struct {
> 	[...]
> 	DMAMap map;
> 	[...]
> } DMAAIOCB;
> 
> /* The callback. */
> static void dma_bdrv_cancel(DMAMap *map)
> {
> 	DMAAIOCB *dbs = container_of(map, DMAAIOCB, map);
> 
> 	[...]
> }
> 
> The upside is we only need to pass the DMAMap. That can also contain
> details of the actual map in case the device wants to release only the
> relevant range and remap the rest.

Fine.
Or maybe DMAAIOCB (just make some letters lower case: DMAIocb?).
Everyone will use it anyway, right?

> > I saw devices do stl_le_phys and such, these
> > might need to be wrapped as well.
> 
> stl_le_phys() is defined and used only by hw/eepro100.c. That's already
> dealt with by converting the device.
> 

I see. Need to get around to adding some prefix to it to make this clear.

> 	Thanks,
> 	Eduard
> 
> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > 
> > ---
> > 
> > diff --git a/hw/dma_rw.h b/hw/dma_rw.h
> > new file mode 100644
> > index 0000000..d63fd17
> > --- /dev/null
> > +++ b/hw/dma_rw.h
> > @@ -0,0 +1,122 @@
> > +#ifndef DMA_RW_H
> > +#define DMA_RW_H
> > +
> > +#include "qemu-common.h"
> > +
> > +/* We currently only have pci mmus, but using
> > +   a generic type makes it possible to use this
> > +   e.g. from the generic ide code without callbacks. */
> > +typedef uint64_t dma_addr_t;
> > +
> > +typedef struct DMAMmu DMAMmu;
> > +typedef struct DMADevice DMADevice;
> > +
> > +typedef int DMATranslateFunc(DMAMmu *mmu,
> > +                             DMADevice *dev,
> > +                             dma_addr_t addr,
> > +                             dma_addr_t *paddr,
> > +                             dma_addr_t *len,
> > +                             int is_write);
> > +
> > +typedef int DMAInvalidateMapFunc(DMADevice *);
> > +struct DMAMmu {
> > +	/* invalidate, etc. */
> > +	DmaTranslateFunc *translate;
> > +};
> > +
> > +struct DMADevice {
> > +	DMAMmu *mmu;
> > +	DMAInvalidateMapFunc *invalidate;
> > +};
> > +
> > +void dma_device_init(DMADevice *, DMAMmu *, DMAInvalidateMapFunc *);
> > +
> > +static inline void dma_memory_rw(DMADevice *dev,
> > +				 dma_addr_t addr,
> > +				 void *buf,
> > +				 uint32_t len,
> > +				 int is_write)
> > +{
> > +    uint32_t plen;
> > +    /* Fast-path non-iommu.
> > +     * More importantly, makes it obvious what this function does. */
> > +    if (!dev->mmu) {
> > +    	cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > +    	return;
> > +    }
> > +    while (len) {
> > +        err = dev->mmu->translate(iommu, dev, addr, &paddr, &plen, is_write);
> > +        if (err) {
> > +            return;
> > +        }
> > +                                      
> > +        /* The translation might be valid for larger regions. */
> > +        if (plen > len) {
> > +            plen = len;
> > +        }
> > +    
> > +        cpu_physical_memory_rw(paddr, buf, plen, is_write);
> > +
> > +        len -= plen;
> > +        addr += plen;
> > +        buf += plen;
> > +    }
> > +}
> > +
> > +void *dma_memory_map(DMADevice *dev,
> > +                            dma_addr_t addr,
> > +                            uint32_t *len,
> > +                            int is_write);
> > +void dma_memory_unmap(DMADevice *dev,
> > +		      void *buffer,
> > +		      uint32_t len,
> > +		      int is_write,
> > +		      uint32_t access_len);
> > +
> > +
> > ++#define DEFINE_DMA_LD(suffix, size)                                       \
> > ++uint##size##_t dma_ld##suffix(DMADevice *dev, dma_addr_t addr)            \
> > ++{                                                                         \
> > ++    int err;                                                              \
> > ++    target_phys_addr_t paddr, plen;                                       \
> > ++    if (!dev->mmu) {                                                      \
> > ++        return ld##suffix##_phys(addr, val);                              \
> > ++    }                                                                     \
> > ++                                                                          \
> > ++    err = dev->mmu->translate(dev->bus->iommu, dev,                       \
> > ++                              addr, &paddr, &plen, IOMMU_PERM_READ);      \
> > ++    if (err || (plen < size / 8))                                         \
> > ++        return 0;                                                         \
> > ++                                                                          \
> > ++    return ld##suffix##_phys(paddr);                                      \
> > ++}
> > ++
> > ++#define DEFINE_DMA_ST(suffix, size)                                       \
> > ++void dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val)  \
> > ++{                                                                         \
> > ++    int err;                                                              \
> > ++    target_phys_addr_t paddr, plen;                                       \
> > ++                                                                          \
> > ++    if (!dev->mmu) {                                                      \
> > ++        st##suffix##_phys(addr, val);                                     \
> > ++        return;                                                           \
> > ++    }                                                                     \
> > ++    err = dev->mmu->translate(dev->bus->iommu, dev,                       \
> > ++                              addr, &paddr, &plen, IOMMU_PERM_WRITE);     \
> > ++    if (err || (plen < size / 8))                                         \
> > ++        return;                                                           \
> > ++                                                                          \
> > ++    st##suffix##_phys(paddr, val);                                        \
> > ++}
> > +
> > +DEFINE_DMA_LD(ub, 8)
> > +DEFINE_DMA_LD(uw, 16)
> > +DEFINE_DMA_LD(l, 32)
> > +DEFINE_DMA_LD(q, 64)
> > +
> > +DEFINE_DMA_ST(b, 8)
> > +DEFINE_DMA_ST(w, 16)
> > +DEFINE_DMA_ST(l, 32)
> > +DEFINE_DMA_ST(q, 64)
> > +
> > +#endif
> > diff --git a/hw/pci.h b/hw/pci.h
> > index 1c6075e..9737f0e 100644
> > --- a/hw/pci.h
> > +++ b/hw/pci.h
> > @@ -5,6 +5,7 @@
> >  #include "qobject.h"
> >  
> >  #include "qdev.h"
> > +#include "dma_rw.h"
> >  
> >  /* PCI includes legacy ISA access.  */
> >  #include "isa.h"
> > @@ -119,6 +120,10 @@ enum {
> >  
> >  struct PCIDevice {
> >      DeviceState qdev;
> > +
> > +    /* For devices that do DMA. */
> > +    DMADevice dma;
> > +
> >      /* PCI config space */
> >      uint8_t *config;
> >  
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux