On 09.11.2010 14:27, Avi Kivity wrote:
> On 11/08/2010 01:21 PM, Jan Kiszka wrote:
>> PCI 2.3 allows IRQ sources to be disabled generically at the device
>> level. This enables us to share IRQs of such devices with others on
>> the host side when passing them to a guest. This feature is optional,
>> user space has to request it explicitly. Moreover, user space can
>> inform us about its view of PCI_COMMAND_INTX_DISABLE so that we can
>> avoid unmasking the interrupt and signaling it if the guest masked it
>> via the PCI config space.
>>
>
> It's a pity this cannot be done transparently. We could detect multiple
> devices sharing the line,

Even that is not possible. Assigned or host devices may be activated
after we have registered exclusively, pushing the breakage from VM
start-up to a different operation.

> but what about PCI_COMMAND_INTX_DISABLE?
>
> Perhaps we can hook the kernel's handler for this bit?

Some IRQ registration notifier that would allow us to re-register our
handler with IRQ sharing support? Maybe.

>>
>>  /* Depends on KVM_CAP_IOMMU */
>>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
>> +/* The following two depend on KVM_CAP_PCI_2_3 */
>> +#define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
>> +#define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
>> +
>> +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx
>> +interrupts via the PCI-2.3-compliant device-level mask, thus enabling
>> +IRQ sharing with other assigned devices or host devices.
>> +KVM_DEV_ASSIGN_MASK_INTX specifies the guest's view of the INTx mask,
>> +see KVM_ASSIGN_SET_INTX_MASK for details.
>>
>>  4.48 KVM_DEASSIGN_PCI_DEVICE
>>
>> @@ -1263,6 +1271,23 @@ struct kvm_assigned_msix_entry {
>>  	__u16 padding[3];
>>  };
>>
>> +5.54 KVM_ASSIGN_SET_INTX_MASK
>
> 4.54?

Of course.

> (54? wow.)

And I don't think all IOCTLs are documented yet (though the majority is
by now).

>
>> +
>> +Capability: KVM_CAP_PCI_2_3
>> +Architectures: x86
>> +Type: vm ioctl
>> +Parameters: struct kvm_assigned_pci_dev (in)
>> +Returns: 0 on success, -1 on error
>> +
>> +Informs the kernel about the guest's view of the INTx mask. As long as
>> +the guest masks the legacy INTx, the kernel will refrain from unmasking
>> +it at hardware level and will not assert the guest's IRQ line. User
>> +space is still responsible for applying this state to the assigned
>> +device's real config space.
>
> What if userspace lies?

That is a user space problem. At worst we will receive one IRQ, mask it,
and then user space needs to react again.

>
>> +
>> +See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is
>> +specified by assigned_dev_id. In the flags field, only
>> +KVM_DEV_ASSIGN_MASK_INTX is evaluated.
>> +
>>
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index fe83eb0..7f1627c 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -468,6 +468,7 @@ struct kvm_assigned_dev_kernel {
>>  	unsigned int entries_nr;
>>  	int host_irq;
>>  	bool host_irq_disabled;
>> +	bool pci_2_3;
>>  	struct msix_entry *host_msix_entries;
>>  	int guest_irq;
>>  	struct msix_entry *guest_msix_entries;
>> @@ -477,6 +478,7 @@ struct kvm_assigned_dev_kernel {
>>  	struct pci_dev *dev;
>>  	struct kvm *kvm;
>>  	spinlock_t intx_lock;
>> +	struct mutex intx_mask_lock;
>>  	char irq_name[32];
>>  };
>
> I saw no reason this can't be a spinlock, but perhaps I missed
> something. This would allow us to avoid srcu, which is slightly more
> expensive than rcu. Since pci 2.3 assigned devices are not a major use
> case, I'd like not to penalize the mainstream users for this.

The lock has to be held across kvm_set_irq, which is the potentially
expensive (O(n), n == number of VCPUs) operation.
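For illustration, the threaded INTx handler then ends up doing something
along these lines (only a sketch of the locking scheme, not literally
the patch; the flags and irq_source_id fields are assumptions here, they
are not part of the quoted hunks):

#include <linux/interrupt.h>
#include <linux/kvm_host.h>

static irqreturn_t kvm_assigned_dev_intx_thread(int irq, void *dev_id)
{
	struct kvm_assigned_dev_kernel *adev = dev_id;

	/*
	 * The mutex has to stay held across kvm_set_irq() so that a
	 * concurrent KVM_ASSIGN_SET_INTX_MASK cannot change the guest's
	 * mask state between the check and the injection.
	 */
	mutex_lock(&adev->intx_mask_lock);
	if (!(adev->flags & KVM_DEV_ASSIGN_MASK_INTX))
		kvm_set_irq(adev->kvm, adev->irq_source_id,
			    adev->guest_irq, 1);
	mutex_unlock(&adev->intx_mask_lock);

	return IRQ_HANDLED;
}

A spinlock here would thus mean busy-waiting for the whole injection
whenever user space toggles the mask concurrently.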
>
> This patch undoes some of the niceness of the previous patches, but I
> have no alternative to suggest.

Yes, it surely does not make things simpler. But much of the complexity
is avoided at runtime when MSIs are used.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux