On Mon, Jan 30, 2023 at 7:54 PM Eli Cohen <elic@xxxxxxxxxx> wrote:
>
> On 30/01/2023 13:34, Michael S. Tsirkin wrote:
> > On Mon, Jan 30, 2023 at 12:01:23PM +0200, Eli Cohen wrote:
> >> On 30/01/2023 10:19, Jason Wang wrote:
> >>> Hi Eli:
> >>>
> >>> On Mon, Jan 23, 2023 at 1:59 PM Eli Cohen <elic@xxxxxxxxxx> wrote:
> >>>> VDPA allows hardware drivers to propagate interrupts from the hardware
> >>>> directly to the vCPU used by the guest. In a typical implementation, the
> >>>> hardware driver will assign interrupt vectors to the virtqueues and report
> >>>> this information back through the get_vq_irq() callback defined in
> >>>> struct vdpa_config_ops.
> >>>>
> >>>> Interrupt vectors can be a scarce resource and may be limited. For such
> >>>> cases, we can let the administrator, through the vdpa tool, set the policy
> >>>> defining how to distribute the available vectors amongst the data virtqueues.
> >>>>
> >>>> The following policies are proposed:
> >>>>
> >>>> 1. First come, first served. Assign a vector to each data virtqueue by
> >>>> virtqueue index. Virtqueues which could not be assigned a dedicated vector
> >>>> would use the hardware driver to propagate interrupts using the available
> >>>> callback mechanism.
> >>>>
> >>>> vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=all
> >>>>
> >>>> This is the default mode and works even if "int=all" was not specified.
> >>>>
> >>>> 2. Use round robin distribution so virtqueues can share vectors.
> >>>> vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=all intmode=share
> >>>>
> >>>> 3. Assign vectors to RX virtqueues only.
> >>>> 3.1 Do not share vectors:
> >>>> vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=rx
> >>>> 3.2 Share vectors:
> >>>> vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=rx intmode=share
> >>>>
> >>>> 4. Assign vectors to TX virtqueues only. Can share or not, like rx.
> >>>>
> >>>> 5. Fail device creation if the number of vectors cannot be fulfilled.
> >>>> vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 max_vq_pairs 8 int=rx intnum=8
> >>>
> >>> I wonder:
> >>>
> >>> 1) how the administrator can know if there are sufficient resources for
> >>> one of the above policies.
> >>
> >> There's no established way to know. The idea is to use whatever there is,
> >> assuming interrupt bypassing is always better than the callback mechanism.
> >>
> >>> 2) how does the administrator know which policy is best, assuming
> >>> the resources are sufficient? (E.g. vectors to RX only or vectors to TX
> >>> only.)
> >>
> >> I don't think there's a rule of thumb here; the administrator needs to
> >> experiment with what works best for him.
> >>
> >>> If it requires a vendor specific way or knowledge, I believe it's
> >>> better to code them in:
> >>>
> >>> 1) the vDPA parent or
> >>> 2) the underlying management tool or drivers
> >>>
> >>> Thanks
> >>
> >> I was also wondering about the current mechanism we have. The hardware
> >> driver reports an irq number for each VQ.
> >>
> >> The guest driver sees a virtio pci device with as many MSI-X vectors as
> >> there are virtqueues.
> >>
> >> Suppose the hardware driver provided only 5 interrupt vectors while there
> >> are 16 VQs.
> >>
> >> Which MSI-X vector at the guest really gets posted interrupts and which one
> >> uses the callback handled by the hardware driver?
> >
> > Not sure I understand.
> > If you get a single interrupt from hardware, callback or posted,
> > you can only drive one interrupt to the guest, no?
>
> For every VQ I have a chance to assign an interrupt vector.
>
> Consider this scenario:
>
> mlx5_vdpa is created with 16 data virtqueues.
>
> mlx5_vdpa associates VQ0 with an interrupt vector. The rest of the VQs
> don't get assigned vectors and use the old callback mechanism.
>
> When you go to the VM and run lspci, you will see the device has 16 MSI-X
> vectors.

Note that the guest MSI-X vectors are emulated by software; you can change
their number by specifying the "vectors=X" parameter of virtio-pci. Those
MSI-X vectors are backed by eventfds which QEMU creates and passes to both
KVM and vhost-vDPA.
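As a rough illustration, the wiring boils down to something like the
following (a minimal userspace sketch, not QEMU's actual code; vdpa_fd,
vm_fd and gsi are placeholder names, and the gsi is assumed to be already
routed to the guest MSI-X vector, which QEMU sets up via
KVM_SET_GSI_ROUTING):

#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>
#include <linux/kvm.h>

/* One eventfd serves both as the call fd of a vhost-vDPA virtqueue and as
 * a KVM irqfd, so signaling it injects the MSI routed at `gsi` into the
 * guest.
 */
static int wire_vq_call(int vdpa_fd, int vm_fd, unsigned int vq_index,
                        unsigned int gsi)
{
        int callfd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
        if (callfd < 0)
                return -1;

        /* vhost-vDPA signals this eventfd when vq_index needs an interrupt
         * (unless the parent driver bypasses it with a dedicated vector). */
        struct vhost_vring_file call = { .index = vq_index, .fd = callfd };
        if (ioctl(vdpa_fd, VHOST_SET_VRING_CALL, &call))
                return -1;

        /* KVM injects the MSI behind `gsi` whenever the same eventfd fires. */
        struct kvm_irqfd irqfd = { .fd = callfd, .gsi = gsi };
        if (ioctl(vm_fd, KVM_IRQFD, &irqfd))
                return -1;

        return callfd;
}

Because the same eventfd is known to both vhost-vDPA and KVM, a signal on
the vq call path reaches the guest vector without a trip through userspace.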
>
> Do you know which of the MSI-X vectors on the guest is the vector I
> assigned for VQ0?

The mapping from a guest MSI-X vector to VQ0 is done via queue_msix_vector
in the common configuration capability, and it is under the control of the
guest virtio-pci driver.

The mapping from a host MSI-X vector to a guest MSI-X vector (required for
posted interrupts) is done by matching the eventfd between KVM and
vhost-vDPA when the eventfds are assigned.

So assuming:

1) the guest driver uses guest-seen MSI-X vector X for vq0
2) the host driver reports irqX via get_vq_irq(0)

then the host MSI-X interrupt behind irqX is mapped to vq0 (via guest-seen
MSI-X vector X) through a posted interrupt when that is possible. If the
posted interrupt can't work for some reason, the code falls back to
vq_callback, which is a simple eventfd_signal().

Thanks
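For completeness, the parent-driver side of that decision shows up in
get_vq_irq(): VQs that received a dedicated vector report its irq so
vhost-vDPA can set up interrupt bypass, and the rest stay on the callback
path. A rough sketch (not mlx5_vdpa's actual code; my_vdpa, vqs[] and irq
are made-up names for illustration):

#include <linux/vdpa.h>

/* Hypothetical parent driver state. */
struct my_vdpa_vq {
        int irq;        /* Linux irq of the vector backing this VQ, or -1 */
};

struct my_vdpa {
        struct vdpa_device vdev;
        struct my_vdpa_vq vqs[16];
};

/* Report the irq of VQs that got a dedicated vector; return an error for
 * the rest so they keep using the eventfd_signal() callback described above.
 */
static int my_vdpa_get_vq_irq(struct vdpa_device *vdev, u16 idx)
{
        struct my_vdpa *ndev = container_of(vdev, struct my_vdpa, vdev);

        if (ndev->vqs[idx].irq < 0)
                return -EOPNOTSUPP;     /* no dedicated vector: use the callback */

        return ndev->vqs[idx].irq;      /* bypass: host irq mapped to guest MSI-X */
}

In the 16-VQ scenario above, only get_vq_irq(0) would return a valid irq;
VQ1..VQ15 would keep signaling their eventfds from the driver.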