On Tuesday 23 November 2010 20:47:33 Avi Kivity wrote:
> On 11/23/2010 10:30 AM, Yang, Sheng wrote:
> > On Tuesday 23 November 2010 15:54:40 Avi Kivity wrote:
> > > On 11/23/2010 08:35 AM, Yang, Sheng wrote:
> > > > On Tuesday 23 November 2010 14:17:28 Avi Kivity wrote:
> > > > > On 11/23/2010 08:09 AM, Yang, Sheng wrote:
> > > > > > Hi Avi,
> > > > > >
> > > > > > I've proposed the following API for mask bit support.
> > > > > >
> > > > > > The main point is, QEmu can know which entries are
> > > > > > enabled (by pci_enable_msix()). And for enabled entries, the
> > > > > > kernel owns them, including the MSI data/address and the mask
> > > > > > bit (routing table and mask bitmap). QEmu should use the
> > > > > > KVM_GET_MSIX_ENTRY ioctl to get them (and it can sync with
> > > > > > them if it wants to do so).
> > > > > >
> > > > > > Before entries are enabled, QEmu can still use its own MSI
> > > > > > table (because we don't keep that kind of information in the
> > > > > > kernel, and it's unnecessary for the kernel).
> > > > > >
> > > > > > The KVM_MSIX_FLAG_ENTRY flag would be clear if QEmu queries
> > > > > > an entry that doesn't exist in the kernel - or we can simply
> > > > > > return -EINVAL for it.
> > > > > >
> > > > > > I suppose it would be rare for QEmu to use this interface
> > > > > > to get the context of an entry (the only case I can think of
> > > > > > is when MSI-X is disabled and QEmu needs to sync the
> > > > > > context), so performance should not be an issue.
> > > > > >
> > > > > > What's your opinion?
> > > > > >
> > > > > > > #define KVM_GET_MSIX_ENTRY _IOWR(KVMIO, 0x7d, struct kvm_msix_entry)
> > > > >
> > > > > Need SET_MSIX_ENTRY for live migration as well.
> > > >
> > > > Currently we don't support LM with VT-d...
> > >
> > > Isn't this work useful for virtio as well?
> >
> > Yeah, but it won't be included in this patchset.
>
> What API changes are needed? I'd like to see the complete API.

I am not sure about that. But I suppose the structure should be the same?
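[For concreteness, here is a rough sketch of what the structure behind the proposed ioctl could look like. Only the ioctl name/number and the KVM_MSIX_FLAG_ENTRY flag come from the thread; the field names, layout, and flag value below are guesses for illustration, not a final ABI.]

```c
/* Hypothetical sketch of the proposed interface; the field layout and
 * the KVM_MSIX_FLAG_ENTRY value are assumptions, not a committed ABI. */
#include <linux/types.h>
#include <linux/ioctl.h>

#define KVM_MSIX_FLAG_ENTRY	(1 << 0)  /* entry is enabled/owned by the kernel */

struct kvm_msix_entry {
	__u32 assigned_dev_id;	/* which assigned device (assumed field) */
	__u32 entry;		/* MSI-X table entry index */
	__u32 flags;		/* KVM_MSIX_FLAG_* */
	__u32 addr_lo;		/* message address, low 32 bits */
	__u32 addr_hi;		/* message address, high 32 bits */
	__u32 data;		/* message data */
	__u32 masked;		/* per-vector mask bit */
	__u32 padding[5];	/* room for extension, e.g. a pending bit */
};

#define KVMIO 0xAE	/* as in <linux/kvm.h> */
#define KVM_GET_MSIX_ENTRY _IOWR(KVMIO, 0x7d, struct kvm_msix_entry)
```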
In fact it's pretty hard for me to imagine what's needed for virtio in the
future, especially since there is no such code now. I really prefer to deal
with assigned devices and virtio separately, which would make the work much
easier. But it seems you won't agree on that.

> > > > > > What about the pending bits?
> > > >
> > > > We didn't cover them here - they're in another MMIO space (the
> > > > PBA). Of course we can add more flags for them later.
> > >
> > > When an entry is masked, we need to set the pending bit for it
> > > somewhere. I guess this is broken in the existing code (without your
> > > patches)?
> >
> > Even with my patch, we don't support the pending bit. It would always
> > return 0 now. What we are supposed to do (after my patch is checked in)
> > is to check the IRQ_PENDING flag of irq_desc->status (if the entry is
> > masked), and return the result to userspace.
> >
> > That would involve some core changes, like exporting irq_to_desc(). I
> > don't think that would be accepted soon, so I would push the mask bit
> > first.
>
> The API needs to be compatible with the pending bit, even if we don't
> implement it now. I want to reduce the rate of API changes.

This can be implemented with this API, just by adding a flag for it. And I
would still take it into consideration in the next API proposal.

> > > > > Also need a new exit reason to tell userspace that an msix
> > > > > entry has changed, so userspace can update mappings.
> > > >
> > > > I think we don't need it. Whenever userspace wants to get a
> > > > mapping which is an enabled MSI-X entry, it can check it with the
> > > > API above (which is quite rare, because the kernel would handle
> > > > all of them while the guest is accessing them). If it's a disabled
> > > > entry, the context inside the userspace MMIO record is the correct
> > > > one (and the only one). The only place I think QEmu needs to sync
> > > > is when MSI-X is about to be disabled; QEmu needs to update its
> > > > own MMIO record.
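[Returning to the pending bit: one hypothetical way to fold it into the same call, as suggested above, is a query flag plus the masked-entry check against IRQ_PENDING. The flag names/values and the helper below are assumptions for illustration only.]

```c
/* Hypothetical extension: reporting the PBA pending bit through the same
 * entry structure by adding a flag. Flag names and values are assumptions. */
#include <assert.h>

#define KVM_MSIX_FLAG_ENTRY	(1 << 0)  /* entry is enabled/owned by kernel */
#define KVM_MSIX_FLAG_PENDING	(1 << 1)  /* caller asks for the pending bit */

/* Sketch of the kernel-side logic described above: if the entry is masked,
 * the pending state would come from the interrupt core (the IRQ_PENDING
 * flag in irq_desc->status, which would require exporting irq_to_desc()).
 * Here irq_pending stands in for that lookup. */
static int msix_pending_bit(unsigned int flags, int masked, int irq_pending)
{
	if (!(flags & KVM_MSIX_FLAG_PENDING))
		return 0;		/* caller didn't ask for it */
	if (!masked)
		return 0;		/* current code always reports 0 */
	return irq_pending ? 1 : 0;	/* masked: reflect IRQ_PENDING */
}
```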
> > > So in-kernel handling of MMIO would be decided per entry? I'm trying
> > > to simplify this, and the simplest thing is - all or nothing.
> >
> > So you would like to handle all MSI-X MMIO in the kernel?
>
> Yes. Writes to address or data would be handled by:
> - recording it into the shadow msix table
> - notifying userspace that msix entry x changed
>
> Reads would be handled in the kernel from the shadow msix table.
>
> So instead of
>
> - guest reads/writes msix
> - kvm filters mmio, implements some, passes others to userspace
>
> we have
>
> - guest reads/writes msix
> - kvm implements all
> - some writes generate an additional notification to userspace

I suppose we don't need to generate a notification to userspace? Because
every read/write is handled by the kernel, and userspace just needs an
interface to the kernel to get/set the entry - and well, does userspace need
to do that when the kernel can handle all of it? Maybe not...

--
regards
Yang, Sheng