On Tuesday 23 November 2010 15:54:40 Avi Kivity wrote: > On 11/23/2010 08:35 AM, Yang, Sheng wrote: > > On Tuesday 23 November 2010 14:17:28 Avi Kivity wrote: > > > On 11/23/2010 08:09 AM, Yang, Sheng wrote: > > > > Hi Avi, > > > > > > > > I've purposed the following API for mask bit support. > > > > > > > > The main point is, QEmu can know which entries are enabled(by > > > > pci_enable_msix()). And for enabled entries, kernel own it, > > > > including MSI data/address and mask bit(routing table and mask > > > > bitmap). QEmu should use KVM_GET_MSIX_ENTRY ioctl to get them(and > > > > it can sync with them if it want to do so). > > > > > > > > Before entries are enabled, QEmu can still use it's own MSI > > > > table(because we didn't contain these kind of information in > > > > kernel, and it's unnecessary for kernel). > > > > > > > > The KVM_MSIX_FLAG_ENTRY flag would be clear if QEmu want to query > > > > one entry didn't exist in kernel - or we can simply return -EINVAL > > > > for it. > > > > > > > > I suppose it would be rare for QEmu to use this interface to get > > > > the context of entry(the only case I think is when MSI-X disable > > > > and QEmu need to sync the context), so performance should not be > > > > an issue. > > > > > > > > What's your opinion? > > > > > > > > > #define KVM_GET_MSIX_ENTRY _IOWR(KVMIO, 0x7d, struct > > > > > kvm_msix_entry) > > > > > > Need SET_MSIX_ENTRY for live migration as well. > > > > Current we don't support LM with VT-d... > > Isn't this work useful for virtio as well? Yeah, but won't be included in this patchset. > > > > > > #define KVM_UPDATE_MSIX_MMIO _IOW(KVMIO, 0x7e, struct > > > > > kvm_msix_mmio) > > > > > > > > > > #define KVM_MSIX_TYPE_ASSIGNED_DEV 1 > > > > > > > > > > #define KVM_MSIX_FLAG_MASKBIT (1<< 0) > > > > > #define KVM_MSIX_FLAG_QUERY_MASKBIT (1<< 0) > > > > > #define KVM_MSIX_FLAG_ENTRY (1<< 1) > > > > > #define KVM_MSIX_FLAG_QUERY_ENTRY (1<< 1) > > > > > > Why is there a need for the flag? If we simply get/set entire > > > entries, that includes the mask bits? > > > > We still want QEmu to cover a part of entries which hasn't been enabled > > yet(which won't existed in routing table), but kernel would cover all > > mask bit regardless of if it's enabled. So QEmu can query any entry to > > check the maskbit, but not address/data. > > Don't understand. If we support reading/writing entire entries, that > works for both enabled and disabled entries? > > > > What about the pending bits? > > > > We didn't cover it here - and it's in another MMIO space(PBA). Of course > > we can add more flags for it later. > > When an entry is masked, we need to set the pending bit for it > somewhere. I guess this is broken in the existing code (without your > patches)? Even with my patch, we didn't support the pending bit. It would always return 0 now. What we supposed to do(after my patch checked in) is to check IRQ_PENDING flag of irq_desc->status(if the entry is masked), and return the result to userspace. That would involve some core change, like to export irq_to_desc(). I don't think it would be accepted soon, so would push mask bit first. > > > > Also need a new exit reason to tell userspace that an msix entry has > > > changed, so userspace can update mappings. > > > > I think we don't need it. Whenever userspace want to get one mapping > > which is an enabled MSI-X entry, it can check it with the API > > above(which is quite rare, because kernel would handle all of them when > > guest is accessing them). If it's a disabled entry, the context inside > > userspace MMIO record is the correct one(and only one). The only place I > > think QEmu need to sync is when MSI-X is about to disabled, QEmu need to > > update it's own MMIO record. > > So in-kernel handling of mmio would be decided per entry? I'm trying to > simplify this, and simplest thing is - all or nothing. So you would like to handle all MSI-X MMIO in kernel? -- regards Yang, Sheng -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html