On Sat, 2012-05-12 at 10:27 +1000, Alexey Kardashevskiy wrote:
> 12.05.2012 5:20, Jason Baron wrote:
> > On Fri, May 11, 2012 at 04:45:21PM +1000, Alexey Kardashevskiy wrote:
> >> Normally the pci_add_capability is called on devices to add new
> >> capability. This is ok for emulated devices which capabilities list
> >> is being built by QEMU.
> >>
> >> In the case of VFIO the capability may already exist and adding new
> >> capability into the beginning of the linked list may create a loop.
> >
> > Hi,
> >
> > I don't quite understand how we get a loop, if 'offset' is supplied to
> > 'pci_add_capability' and there is an overlap we get -EINVAL. Otherwise,
> > we are adding the capability in a new empty space. So, I see how we
> > could get the capability in the list twice, but not how there is a loop.
> > what am I missing?
>
>
> This happens only with VFIO.
>
> The capability already exists in the config space as it is fetched from
> the host kernel _before_ msi_init is called. Furthermore, msi_init() is
> called when VFIO sees this capability in the config space.
>
> We probably want to re-add all capabilities, do not know...

Yep, I've had msi[1] and msix[2] patches in my vfio tree for a long
time, but we really want to support this generically for all
capabilities.  We need to either detect that the config space is
already programmed or allow the caller to specify it.  Note that even
if we don't create a loop, particularly finicky drivers may balk at
just changing the order of the capabilities list.
Thanks,

Alex

[1] https://github.com/awilliam/qemu-vfio/commit/a9f04351610ab69e22d90a76dc85be3269000a9f
[2] https://github.com/awilliam/qemu-vfio/commit/b4de3d0436b0260fbc6fcd40787c1c92ffca2980

> >>
> >> For example, the old code destroys the following config
> >> of PCIe Intel E1000E:
> >>
> >> before adding PCI_CAP_ID_MSI (0x05):
> >> 0x34: 0xC8
> >> 0xC8: 0x01 0xD0
> >> 0xD0: 0x05 0xE0
> >> 0xE0: 0x10 0x00
> >>
> >> after:
> >> 0x34: 0xD0
> >> 0xC8: 0x01 0xD0
> >> 0xD0: 0x05 0xC8
> >> 0xE0: 0x10 0x00
> >>
> >> As result capabilities 0x01 and 0x05 point to each other.
> >>
> >> The proposed patch does not change capability pointers when
> >> the same type capability is about to add.
> >>
> >> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
> >> ---
> >>  hw/pci.c |   10 ++++++----
> >>  1 files changed, 6 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/hw/pci.c b/hw/pci.c
> >> index aa0c0b8..1f7c924 100644
> >> --- a/hw/pci.c
> >> +++ b/hw/pci.c
> >> @@ -1794,10 +1794,12 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
> >>      }
> >>
> >>      config = pdev->config + offset;
> >> -    config[PCI_CAP_LIST_ID] = cap_id;
> >> -    config[PCI_CAP_LIST_NEXT] = pdev->config[PCI_CAPABILITY_LIST];
> >> -    pdev->config[PCI_CAPABILITY_LIST] = offset;
> >> -    pdev->config[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
> >> +    if (config[PCI_CAP_LIST_ID] != cap_id) {
> >> +        config[PCI_CAP_LIST_ID] = cap_id;
> >> +        config[PCI_CAP_LIST_NEXT] = pdev->config[PCI_CAPABILITY_LIST];
> >> +        pdev->config[PCI_CAPABILITY_LIST] = offset;
> >> +        pdev->config[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
> >> +    }
> >>      memset(pdev->used + offset, 0xFF, size);
> >>      /* Make capability read-only by default */
> >>      memset(pdev->wmask + offset, 0, size);
> >
>
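
To make the loop from the commit message concrete, here is a minimal
standalone C sketch. It is not QEMU code: load_e1000e(),
link_capability() and walk_capabilities() are hypothetical helpers
written for illustration, and the skip_if_present flag is only a
stand-in for the check the quoted patch adds. The register offsets are
the standard PCI ones; the byte values are the E1000E layout quoted
above.

```c
#include <stdint.h>
#include <string.h>

/* Standard PCI config-space offsets and bits. */
#define PCI_STATUS           0x06
#define PCI_STATUS_CAP_LIST  0x10
#define PCI_CAPABILITY_LIST  0x34
#define PCI_CAP_LIST_ID      0
#define PCI_CAP_LIST_NEXT    1

/* Hypothetical helper: fill a 256-byte config space with the E1000E
 * capability layout from the commit message (0xC8 -> 0xD0 -> 0xE0). */
static void load_e1000e(uint8_t *cfg)
{
    memset(cfg, 0, 256);
    cfg[PCI_STATUS] = PCI_STATUS_CAP_LIST;
    cfg[PCI_CAPABILITY_LIST] = 0xC8;
    cfg[0xC8] = 0x01; cfg[0xC9] = 0xD0;   /* power management */
    cfg[0xD0] = 0x05; cfg[0xD1] = 0xE0;   /* MSI */
    cfg[0xE0] = 0x10; cfg[0xE1] = 0x00;   /* PCI Express, end of list */
}

/* Hypothetical helper: only the list-linking step of
 * pci_add_capability(). With skip_if_present == 0 it behaves like the
 * old code; with skip_if_present != 0 it leaves the pointers alone
 * when cap_id is already stored at offset, as the patch proposes. */
static void link_capability(uint8_t *cfg, uint8_t cap_id, uint8_t offset,
                            int skip_if_present)
{
    uint8_t *cap = cfg + offset;

    if (skip_if_present && cap[PCI_CAP_LIST_ID] == cap_id) {
        return;
    }
    cap[PCI_CAP_LIST_ID] = cap_id;
    cap[PCI_CAP_LIST_NEXT] = cfg[PCI_CAPABILITY_LIST];
    cfg[PCI_CAPABILITY_LIST] = offset;
    cfg[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
}

/* Hypothetical helper: walk the capability list. Returns the number
 * of entries, or -1 if an offset is visited twice (a loop). */
static int walk_capabilities(const uint8_t *cfg)
{
    uint8_t seen[256] = {0};
    int count = 0;
    /* The low two bits of a capability pointer are reserved. */
    uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;

    while (pos) {
        if (seen[pos]) {
            return -1;
        }
        seen[pos] = 1;
        count++;
        pos = cfg[pos + PCI_CAP_LIST_NEXT] & 0xFC;
    }
    return count;
}
```

Calling link_capability(cfg, 0x05, 0xD0, 0) on the loaded layout
reproduces the "after" dump: 0x34 points to 0xD0, 0xD0 points back to
0xC8, and walk_capabilities() reports a loop. With skip_if_present set,
the pointers are left untouched and the walk still finds three
capabilities. A walk like this is also one possible way to detect that
the config space is already programmed, though that is an assumption of
this sketch and not something either patch does.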