On 12/05/12 00:13, Alexander Graf wrote: > > On 11.05.2012, at 14:47, Alexey Kardashevskiy wrote: > >> 11.05.2012 20:52, Alexander Graf написал: >>> >>> On 11.05.2012, at 08:45, Alexey Kardashevskiy wrote: >>> >>>> Normally the pci_add_capability is called on devices to add new >>>> capability. This is ok for emulated devices which capabilities list >>>> is being built by QEMU. >>>> >>>> In the case of VFIO the capability may already exist and adding new >>>> capability into the beginning of the linked list may create a loop. >>>> >>>> For example, the old code destroys the following config >>>> of PCIe Intel E1000E: >>>> >>>> before adding PCI_CAP_ID_MSI (0x05): >>>> 0x34: 0xC8 >>>> 0xC8: 0x01 0xD0 >>>> 0xD0: 0x05 0xE0 >>>> 0xE0: 0x10 0x00 >>>> >>>> after: >>>> 0x34: 0xD0 >>>> 0xC8: 0x01 0xD0 >>>> 0xD0: 0x05 0xC8 >>>> 0xE0: 0x10 0x00 >>>> >>>> As result capabilities 0x01 and 0x05 point to each other. >>>> >>>> The proposed patch does not change capability pointers when >>>> the same type capability is about to add. >>>> >>>> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx> >>>> --- >>>> hw/pci.c | 10 ++++++---- >>>> 1 files changed, 6 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/hw/pci.c b/hw/pci.c >>>> index aa0c0b8..1f7c924 100644 >>>> --- a/hw/pci.c >>>> +++ b/hw/pci.c >>>> @@ -1794,10 +1794,12 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id, >>>> } >>>> >>>> config = pdev->config + offset; >>>> - config[PCI_CAP_LIST_ID] = cap_id; >>>> - config[PCI_CAP_LIST_NEXT] = pdev->config[PCI_CAPABILITY_LIST]; >>>> - pdev->config[PCI_CAPABILITY_LIST] = offset; >>>> - pdev->config[PCI_STATUS] |= PCI_STATUS_CAP_LIST; >>>> + if (config[PCI_CAP_LIST_ID] != cap_id) { >>> >>> This doesn't scale. Capabilities are a list of CAPs. You'll have to do a loop through all capabilities, check if the one you want to add is there already and if so either >>> * replace the existing one or >>> * drop out and not write the new one in. > > * hw_error :) > >>> >>> I'm not sure which way would be more natural. >> >> There is a third option - add another function, lets call it >> pci_fixup_capability() which would do whatever pci_add_capability() does >> but won't touch list pointers. > > What good is a function that breaks internal consistency? It is broken already by having PCIDevice.used field. Normally pci_add_capability() would go through the whole list and add a capability if it does not exist. Emulated devices which care about having a capability at some fixed offset would have initialized their config space before calling this capabilities API (as VFIO does). If we really want to support emulated devices which want some capabilities be at fixed offset and others at random offsets (strange, but ok), I do not see how it is bad to restore this consistency by special function (pci_fixup_capability()) to avoid its rewriting at different location as a guest driver may care about its offset. >> When vfio, pci_add_capability() is called from the code which knows >> exactly that the capability exists and where it is and it calls >> pci_add_capability() based on this knowledge so doing additional loops >> just for imaginery scalability is a bit weird, no? > > Not sure I understand your proposal. The more generic a framework is, the better, no? In this code path we don't care about speed. We only care about consistency and reliability. > > > Alex > -- Alexey -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html