Hi Punit, Thanks for the review On 09/03/18 16:53, Punit Agrawal wrote: >> +static int vfio_pci_disable_msis(struct kvm *kvm, struct vfio_device *vdev) >> +{ >> + int ret; >> + struct vfio_pci_device *pdev = &vdev->pci; >> + struct vfio_pci_msi_common *msis = &pdev->msix; >> + struct vfio_irq_set irq_set = { >> + .argsz = sizeof(irq_set), >> + .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER, >> + .index = msis->info.index, >> + .start = 0, >> + .count = 0, >> + }; >> + >> + if (!msi_is_enabled(msis->phys_state) || >> + msi_is_enabled(msis->virt_state)) > > Is the check for virt_state really needed? It's not clear why it matters > - please add a comment to save some future head scratching. No, it doesn't seem like we can reach this if virt_state is enabled. >> + return 0; >> + >> + ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); >> + if (ret < 0) { >> + perror("VFIO_DEVICE_SET_IRQS(NONE)"); >> + return ret; >> + } >> + >> + msi_set_enabled(msis->phys_state, false); >> + msi_set_empty(msis->phys_state, true); >> + >> + return 0; >> +} >> + > > [...] > >> +static void vfio_pci_msix_table_access(struct kvm_cpu *vcpu, u64 addr, u8 *data, >> + u32 len, u8 is_write, void *ptr) >> +{ >> + struct kvm *kvm = vcpu->kvm; >> + struct vfio_pci_msi_entry *entry; >> + struct vfio_pci_device *pdev = ptr; >> + struct vfio_device *vdev = container_of(pdev, struct vfio_device, pci); >> + >> + u64 offset = addr - pdev->msix_table.guest_phys_addr; >> + >> + size_t vector = offset / PCI_MSIX_ENTRY_SIZE; >> + /* PCI spec says that software must use aligned 4 or 8 bytes accesses */ >> + off_t field = offset % PCI_MSIX_ENTRY_SIZE; > > Does the 4/8 byte access assumption be checked? Or is this something > enforced before we get here. It's specific to MSI-X tables, so it would need to be checked here. The spec (6.8.2) says "undefined behaviour", so we're free to do whatever we want. I suppose we can return here > >> + entry = &pdev->msix.entries[vector]; >> + >> + mutex_lock(&pdev->msix.mutex); >> + >> + if (!is_write) { >> + memcpy(data, (void *)&entry->config + field, len); >> + goto out_unlock; >> + } >> + >> + memcpy((void *)&entry->config + field, data, len); >> + >> + if (field != PCI_MSIX_ENTRY_VECTOR_CTRL) >> + goto out_unlock; > > Could an 8 byte access modify the vector control field? Yes (explicitly allowed in "Special Considerations for QWORD Accesses", even.) I'll refine this check. >> >> +static int vfio_pci_create_msix_table(struct kvm *kvm, >> + struct vfio_pci_device *pdev) >> +{ >> + int ret; >> + size_t i; >> + size_t nr_entries; >> + size_t table_size; >> + struct vfio_pci_msi_entry *entries; >> + struct vfio_pci_msix_pba *pba = &pdev->msix_pba; >> + struct vfio_pci_msix_table *table = &pdev->msix_table; >> + struct msix_cap *msix = PCI_CAP(&pdev->hdr, pdev->msix.pos); >> + >> + table->bar = msix->table_offset & PCI_MSIX_TABLE_BIR; >> + pba->bar = msix->pba_offset & PCI_MSIX_TABLE_BIR; >> + >> + /* >> + * KVM needs memory regions to be multiple of and aligned on PAGE_SIZE. >> + */ >> + nr_entries = (msix->ctrl & PCI_MSIX_FLAGS_QSIZE) + 1; >> + table_size = ALIGN(nr_entries * PCI_MSIX_ENTRY_SIZE, PAGE_SIZE); >> + >> + entries = calloc(nr_entries, sizeof(struct vfio_pci_msi_entry)); >> + if (!entries) >> + return -ENOMEM; >> + >> + for (i = 0; i < nr_entries; i++) >> + entries[i].config.ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT; >> + >> + /* >> + * To ease MSI-X cap configuration in case they share the same BAR, >> + * collapse table and pending array. According to PCI, address spaces >> + * must be power of two. Since nr_entries is a power of two, and PBA >> + * size is less than table_size, reserve 2*table_size. >> + */ > > As I understand it, nr_entries is between 0x0 < nr_entries <= 0x800 but > does not have to be a power of two. Also, it's not clear why "address > spaces must be a power of two" is relevant. What am I missing? Uh right, MSI-X doesn't put a restriction on nr_entries. I probably confused it with the MSI restriction. If I understood 6.2.5.1 correctly in the PCI spec, the size of the area pointed to by the bar (oddly referred to as "address space"?) must be a power of two, hence the 2*table_size allocation. But since nr_entries is not a power of two, I need to change the table_size calculation. Looks like I need to add fls() to kvmtool. Thanks, Jean