Hi Andre,

On Tue, 31 Aug 2021 12:10:35 +0100,
Andre Przywara <andre.przywara@xxxxxxx> wrote:
> 
> On Fri, 27 Aug 2021 12:54:05 +0100
> Marc Zyngier <maz@xxxxxxxxxx> wrote:
> 
> Hi Marc,
> 
> > Since 45d3b59e8c45 ("kvm tools: Increase amount of possible interrupts
> > per PCI device"), the number of MSI-S has gone from 4 to 33.
> > 
> > However, the corresponding storage hasn't been upgraded, and writing
> > to the MSI-X table is a pretty risky business. Now that the Linux
> > kernel writes to *all* MSI-X entries before doing anything else
> > with the device, kvmtool dies a horrible death.
> > 
> > Fix it by properly defining the size of the MSI-X bar, and make
> > Linux great again.
> > 
> > This includes some fixes the PBA region decoding, as well as minor
> > cleanups to make this code a bit more maintainable.
> > 
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> 
> Many thanks for fixing this, it looks good to me now. Just some
> questions below:
> 
> > ---
> >  virtio/pci.c | 42 ++++++++++++++++++++++++++++++------------
> >  1 file changed, 30 insertions(+), 12 deletions(-)
> > 
> > diff --git a/virtio/pci.c b/virtio/pci.c
> > index eb91f512..41085291 100644
> > --- a/virtio/pci.c
> > +++ b/virtio/pci.c
> > @@ -7,6 +7,7 @@
> >  #include "kvm/irq.h"
> >  #include "kvm/virtio.h"
> >  #include "kvm/ioeventfd.h"
> > +#include "kvm/util.h"
> > 
> >  #include <sys/ioctl.h>
> >  #include <linux/virtio_pci.h>
> > @@ -14,6 +15,13 @@
> >  #include <assert.h>
> >  #include <string.h>
> > 
> > +#define ALIGN_UP(x, s) ALIGN((x) + (s) - 1, (s))
> > +#define VIRTIO_NR_MSIX (VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG)
> > +#define VIRTIO_MSIX_TABLE_SIZE (VIRTIO_NR_MSIX * 16)
> > +#define VIRTIO_MSIX_PBA_SIZE (ALIGN_UP(VIRTIO_MSIX_TABLE_SIZE, 64) / 8)
> > +#define VIRTIO_MSIX_BAR_SIZE (1UL << fls_long(VIRTIO_MSIX_TABLE_SIZE + \
> > +					       VIRTIO_MSIX_PBA_SIZE))
> > +
> >  static u16 virtio_pci__port_addr(struct virtio_pci *vpci)
> >  {
> >  	return pci__bar_address(&vpci->pci_hdr, 0);
> > @@ -333,18 +341,27 @@ static void virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu,
> >  	struct virtio_pci *vpci = vdev->virtio;
> >  	struct msix_table *table;
> >  	u32 msix_io_addr = virtio_pci__msix_io_addr(vpci);
> > +	u32 pba_offset;
> >  	int vecnum;
> >  	size_t offset;
> > 
> > -	if (addr > msix_io_addr + PCI_IO_SIZE) {
> 
> Ouch, the missing "=" looks like another long standing bug you fixed, I
> wonder how this ever worked before? Looking deeper it looks like the
> whole PBA code was quite broken (allowing writes, for instance, and
> mixing with the code for the MSIX table)?

I don't think it ever worked. And to be fair, no known guest ever
reads from it either. It's just that as I was reworking it, some of the
pitfalls became obvious.

> 
> > +	BUILD_BUG_ON(VIRTIO_NR_MSIX > (sizeof(vpci->msix_pba) * 8));
> > +
> > +	pba_offset = vpci->pci_hdr.msix.pba_offset & ~PCI_MSIX_TABLE_BIR;
> 
> Any particular reason you read back the offset from the MSIX capability
> instead of just using VIRTIO_MSIX_TABLE_SIZE here? Is that to avoid
> accidentally diverging in the future, by having just one place of
> definition?

Exactly. My first version of this patch actually failed to update the
offset advertised to the guest, so I decided to just have a single
location for this. At least, we won't have to touch this code again if
we change the number of MSI-X.

> 
> > +	if (addr >= msix_io_addr + pba_offset) {
> > +		/* Read access to PBA */
> >  		if (is_write)
> >  			return;
> > -		table = (struct msix_table *)&vpci->msix_pba;
> > -		offset = addr - (msix_io_addr + PCI_IO_SIZE);
> > -	} else {
> > -		table = vpci->msix_table;
> > -		offset = addr - msix_io_addr;
> > +		offset = addr - (msix_io_addr + pba_offset);
> > +		if ((offset + len) > sizeof (vpci->msix_pba))
> > +			return;
> > +		memcpy(data, (void *)&vpci->msix_pba + offset, len);
> 
> Should this be a char* cast, since pointer arithmetic on void* is
> somewhat frowned upon (aka "forbidden in the C standard, but allowed as
> a GCC extension")?

I am trying to be consistent. A quick grep shows at least 19
occurrences of pointer arithmetic with '(void *)', and none with
'(char *)'. Happy for someone to go and repaint this, but I don't
think this should be the purpose of this patch.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
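
For illustration only (not part of the patch): the PBA/table dispatch and
bounds checks discussed above can be sketched as a small standalone
function. Everything here is an assumption made for the example: the names
struct msix_bar, msix_access, NR_MSIX and TABLE_SIZE are hypothetical, the
layout is simplified, and the byte offset uses a (char *) cast purely to
show the standard-conforming variant of the (void *) arithmetic mentioned
in the review. kvmtool itself keeps using (void *).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen only for this sketch. */
#define NR_MSIX		33
#define TABLE_SIZE	(NR_MSIX * 16)	/* 16 bytes per MSI-X table entry */

/* Simplified model of an MSI-X BAR: a vector table followed by a PBA. */
struct msix_bar {
	uint8_t  table[TABLE_SIZE];
	uint64_t pba;		/* one pending bit per vector */
	uint32_t pba_offset;	/* offset of the PBA within the BAR */
};

/*
 * Handle one access at byte offset 'off' within the BAR.
 * Returns 0 if the access was performed, -1 if it was rejected.
 */
static int msix_access(struct msix_bar *bar, uint32_t off,
		       void *data, size_t len, int is_write)
{
	assert(bar && data);

	if (off >= bar->pba_offset) {
		size_t pba_off = off - bar->pba_offset;

		/* The PBA is read-only from the guest's point of view. */
		if (is_write)
			return -1;
		/* Reject reads that run past the end of the PBA storage. */
		if (pba_off + len > sizeof(bar->pba))
			return -1;
		/* (char *) makes the byte arithmetic valid ISO C. */
		memcpy(data, (char *)&bar->pba + pba_off, len);
		return 0;
	}

	/* Table access: reject anything that spills past the table. */
	if (off + len > sizeof(bar->table))
		return -1;
	if (is_write)
		memcpy(bar->table + off, data, len);
	else
		memcpy(data, bar->table + off, len);
	return 0;
}
```

Keeping pba_offset in the structure, rather than recomputing it from
TABLE_SIZE at the point of use, mirrors the "single location" argument
above: the dispatch reads back whatever offset was advertised, so the two
cannot drift apart.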