On 07/15/2010 11:45 AM, Eduard - Gabriel Munteanu wrote:
On Thu, Jul 15, 2010 at 07:45:06AM -0500, Anthony Liguori wrote:
No. PCI devices should never call cpu_physical_memory*.
PCI devices should call pci_memory*.
ISA devices should call isa_memory*.
All device memory accesses should go through their respective buses.
There can be multiple IOMMUs at different levels of the device
hierarchy. If you don't provide bus-level memory access functions that
chain through the hierarchy, it's extremely difficult to implement all
the necessary hooks to perform the translations at different places.
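
As a rough illustration of the chaining described above, here is a minimal sketch of bus-level access functions in which each bus applies its own translation and then defers to its parent. Every name in it (ToyPCIBus, toy_pci_memory_rw, the fixed-offset "IOMMU", the guest_ram array) is a hypothetical stand-in invented for this example, not QEMU's actual API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; none of these types or helpers are QEMU's. */
typedef uint64_t bus_addr_t;

static uint8_t guest_ram[256];          /* toy "physical memory" */

/* Bottom of the chain: plays the role of cpu_physical_memory_rw(). */
static void toy_physical_memory_rw(bus_addr_t addr, uint8_t *buf,
                                   size_t len, int is_write)
{
    assert(addr + len <= sizeof(guest_ram));
    if (is_write) {
        memcpy(&guest_ram[addr], buf, len);
    } else {
        memcpy(buf, &guest_ram[addr], len);
    }
}

typedef struct ToyPCIBus {
    struct ToyPCIBus *parent;   /* NULL for the root bus */
    bus_addr_t iommu_offset;    /* toy IOMMU: a fixed offset (assumption) */
} ToyPCIBus;

typedef struct ToyPCIDevice {
    ToyPCIBus *bus;
} ToyPCIDevice;

/* Each bus level applies its own translation, then defers upward. */
static void toy_pci_bus_memory_rw(ToyPCIBus *bus, bus_addr_t addr,
                                  uint8_t *buf, size_t len, int is_write)
{
    addr += bus->iommu_offset;
    if (bus->parent) {
        toy_pci_bus_memory_rw(bus->parent, addr, buf, len, is_write);
    } else {
        toy_physical_memory_rw(addr, buf, len, is_write);
    }
}

/* What device code would call instead of touching physical memory. */
static void toy_pci_memory_rw(ToyPCIDevice *dev, bus_addr_t addr,
                              uint8_t *buf, size_t len, int is_write)
{
    toy_pci_bus_memory_rw(dev->bus, addr, buf, len, is_write);
}
```

The device only ever sees the typed entry point for its own bus; where and how many times the address gets translated is decided by the buses above it.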
Regards,
Anthony Liguori
I liked Paul's initial approach more, at least if I understood him
correctly. Basically I'm suggesting a single memory_* function that
simply asks the bus for I/O and translation. Say you have something like
this:
+ Bus 1
|
---- Memory 1
|
---+ Bus 2 bridge
   |
   ---- Memory 2
   |
   ---+ Bus 3 bridge
      |
      ---- Device
Say Device wants to write to memory. If we have the DeviceState, the
device code itself needn't care whether this is a BusOneDevice or a
BusTwoDevice. We would just call
    memory_rw(dev_state, addr, buf, size, is_write);
I dislike this API for a few reasons:
1) Buses have different types of addresses with different address
ranges. This API would have to take a generic address type.
2) dev_state would be the qdev device state. This means qdev needs to
have memory hook mechanisms that are chainable. I think that's
unnecessary at the qdev level.
3) Users have upcasted device states, so it's more natural to pass
PCIDevice than DeviceState.
4) There's an assumption that all devices can get to DeviceState.
That's not always true today.
which simply recurses through DeviceStates and BusStates via their
parent pointers. The actual bus can set those up to provide
identification information and perhaps hooks for translation and access
checking. So memory_rw() looks like this (pseudocode):
static void memory_rw(DeviceState *dev,
                      target_phys_addr_t addr,
                      uint8_t *buf,
                      int size,
                      int is_write)
{
    BusState *bus = get_parent_bus_of_dev(dev);
    DeviceState *pdev = get_parent_dev(dev);
    target_phys_addr_t taddr;

    if (!bus) {
        /* This shouldn't happen. */
        assert(0);
    }

    if (bus->responsible_for(addr)) {
        raw_physical_memory_rw(addr, buf, size, is_write);
        return;
    }

    taddr = bus->translate(dev, addr);
    memory_rw(pdev, taddr, buf, size, is_write);
This is too simplistic because you sometimes have layering that doesn't
fit into the bus model. For instance, virtio + pci.
We really want a virtio_memory_rw that calls either syborg_memory_rw or
pci_memory_rw based on the transport. In your proposal, we would have
to model virtio-pci as a bus with a single device which appears awkward
to me.
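
One way to picture the layering meant here is a virtio-level function that dispatches to whichever transport the device sits on. The sketch below is only an assumption about how that could look: the callback field, the toy transport functions, and the fixed offset standing in for PCI-side translation are all invented for illustration and are not QEMU's virtio code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is not QEMU's virtio code. */
static uint8_t toy_ram[128];

static void toy_ram_rw(uint64_t addr, uint8_t *buf, size_t len, int is_write)
{
    assert(addr + len <= sizeof(toy_ram));
    if (is_write) {
        memcpy(&toy_ram[addr], buf, len);
    } else {
        memcpy(buf, &toy_ram[addr], len);
    }
}

/* Stand-in for pci_memory_rw(): adds a toy offset to mimic translation. */
static void toy_pci_transport_rw(void *transport_dev, uint64_t addr,
                                 uint8_t *buf, size_t len, int is_write)
{
    (void)transport_dev;
    toy_ram_rw(addr + 64, buf, len, is_write);
}

/* Stand-in for syborg_memory_rw(): no translation at all. */
static void toy_syborg_transport_rw(void *transport_dev, uint64_t addr,
                                    uint8_t *buf, size_t len, int is_write)
{
    (void)transport_dev;
    toy_ram_rw(addr, buf, len, is_write);
}

typedef void (*toy_transport_rw)(void *transport_dev, uint64_t addr,
                                 uint8_t *buf, size_t len, int is_write);

typedef struct ToyVirtIODevice {
    toy_transport_rw transport_rw;  /* chosen when the device is bound */
    void *transport_dev;            /* e.g. the PCI or Syborg proxy */
} ToyVirtIODevice;

/* virtio code calls this and never needs to know the transport. */
static void toy_virtio_memory_rw(ToyVirtIODevice *vdev, uint64_t addr,
                                 uint8_t *buf, size_t len, int is_write)
{
    assert(vdev->transport_rw != NULL);
    vdev->transport_rw(vdev->transport_dev, addr, buf, len, is_write);
}
```

With this shape, virtio-pci does not have to be modelled as a bus holding a single device; the transport is just a property of the virtio device.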
Regards,
Anthony Liguori
}
If we do this, it seems there's no need to provide separate per-bus
functions. The actual buses must instead initialize those hooks
properly. Translation here is something inherent to the bus, which
handles arbitration between possibly multiple IOMMUs. Our memory would
normally reside on / belong to the top-level bus.
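
To make the hook idea concrete, here is a small compilable sketch of the recursive walk, under several assumptions: the ToyBus/ToyDevice types, the hook signatures, and the example hooks (a top-level bus that owns low addresses, a bridge bus that owns nothing and shifts addresses by 16) are all made up for illustration and are not qdev's real structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t toy_addr_t;    /* stand-in for target_phys_addr_t */

struct ToyDevice;

/* Hypothetical bus: hooks are filled in by whoever creates the bus. */
typedef struct ToyBus {
    struct ToyDevice *parent_dev;   /* bridge owning this bus; NULL at top */
    int (*responsible_for)(struct ToyBus *bus, toy_addr_t addr);
    toy_addr_t (*translate)(struct ToyBus *bus, struct ToyDevice *dev,
                            toy_addr_t addr);
    uint8_t *mem;                   /* memory residing on this bus, if any */
    size_t mem_size;
} ToyBus;

typedef struct ToyDevice {
    ToyBus *parent_bus;
} ToyDevice;

static void toy_memory_rw(ToyDevice *dev, toy_addr_t addr,
                          uint8_t *buf, size_t size, int is_write)
{
    ToyBus *bus = dev->parent_bus;

    assert(bus != NULL);
    if (bus->responsible_for(bus, addr)) {
        assert(addr + size <= bus->mem_size);
        if (is_write) {
            memcpy(bus->mem + addr, buf, size);
        } else {
            memcpy(buf, bus->mem + addr, size);
        }
        return;
    }
    /* Not ours: translate, then retry one level up through the bridge. */
    assert(bus->parent_dev != NULL);
    toy_memory_rw(bus->parent_dev, bus->translate(bus, dev, addr),
                  buf, size, is_write);
}

/* Example hooks (assumptions, for the usage below). */
static int toy_owns_low(ToyBus *bus, toy_addr_t addr)
{
    return addr < bus->mem_size;
}

static int toy_owns_none(ToyBus *bus, toy_addr_t addr)
{
    (void)bus;
    (void)addr;
    return 0;
}

static toy_addr_t toy_shift_16(ToyBus *bus, ToyDevice *dev, toy_addr_t addr)
{
    (void)bus;
    (void)dev;
    return addr + 16;
}
```

A device on the bridged bus writing to address 3 would, with these example hooks, end up touching offset 19 of the top-level bus's memory; the device code itself only calls toy_memory_rw().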
What do you think? (Naming could be better though.)
Eduard