Re: RFC: VDPA Interrupt vector distribution

On Mon, Jan 30, 2023 at 01:54:14PM +0200, Eli Cohen wrote:
> 
> On 30/01/2023 13:34, Michael S. Tsirkin wrote:
> > On Mon, Jan 30, 2023 at 12:01:23PM +0200, Eli Cohen wrote:
> > > On 30/01/2023 10:19, Jason Wang wrote:
> > > > Hi Eli:
> > > > 
> > > > On Mon, Jan 23, 2023 at 1:59 PM Eli Cohen <elic@xxxxxxxxxx> wrote:
> > > > > VDPA allows hardware drivers to propagate interrupts from the hardware
> > > > > directly to the vCPU used by the guest. In a typical implementation, the
> > > > > hardware driver will assign the interrupt vectors to the virtqueues and report
> > > > > this information back through the get_vq_irq() callback defined in
> > > > > struct vdpa_config_ops.
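> > > > > 
> > > > > As a reference, a minimal sketch of the parent driver side (only the
> > > > > get_vq_irq() signature is real; my_vq, my_parent and
> > > > > vdpa_to_my_parent() are illustrative, not from any actual driver):
> > > > > 
> > > > > /* hypothetical per-VQ state kept by the parent driver */
> > > > > struct my_vq {
> > > > > 	int irq;	/* Linux irq number, or -EINVAL if no vector */
> > > > > };
> > > > > 
> > > > > static int my_get_vq_irq(struct vdpa_device *vdev, u16 idx)
> > > > > {
> > > > > 	struct my_parent *p = vdpa_to_my_parent(vdev);
> > > > > 
> > > > > 	/* VQs that got no dedicated vector return an error, so the
> > > > > 	 * core falls back to the callback mechanism */
> > > > > 	return p->vqs[idx].irq;
> > > > > }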
> > > > > 
> > > > > Interrupt vectors can be a scarce resource. When there are not enough
> > > > > for all virtqueues, we can let the administrator, through the vdpa tool,
> > > > > set the policy defining how to distribute the available vectors amongst
> > > > > the data virtqueues.
> > > > > 
> > > > > The following policies are proposed (a sketch of the distribution logic
> > > > > appears after the list):
> > > > > 
> > > > > 1. First come, first served. Assign a vector to each data virtqueue by
> > > > >       virtqueue index. Virtqueues which could not be assigned a dedicated
> > > > >       vector would fall back to the hardware driver's callback mechanism
> > > > >       for propagating interrupts.
> > > > > 
> > > > >       vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=all
> > > > > 
> > > > >       This is the default mode and applies even if "int=all" is not specified.
> > > > > 
> > > > > 2. Use round robin distribution so virtqueues can share vectors.
> > > > >       vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=all intmode=share
> > > > > 
> > > > > 3. Assign vectors to RX virtqueues only.
> > > > > 3.1 Do not share vectors
> > > > >        vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=rx
> > > > > 3.2 Share vectors
> > > > >        vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 int=rx intmode=share
> > > > > 
> > > > > 4. Assign vectors to TX virtqueues only. Sharing can be enabled or not, as with rx.
> > > > > 5. Fail device creation if the requested number of vectors cannot be fulfilled.
> > > > >       vdpa dev add name vdpa0 mgmtdev pci/0000:86:00.2 max_vq_pairs 8 int=rx intnum=8
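> > > > > 
> > > > > To make the policies concrete, here is a rough sketch of how a parent
> > > > > driver could spread nvec hardware vectors over nvqs data virtqueues;
> > > > > my_vq and my_vector_to_irq() are illustrative helpers, not actual
> > > > > driver code:
> > > > > 
> > > > > /* default: first come, first served; VQs beyond nvec fall back
> > > > >  * to the callback mechanism (irq = -EINVAL) */
> > > > > static void assign_fcfs(struct my_vq *vqs, int nvqs, int nvec)
> > > > > {
> > > > > 	int i;
> > > > > 
> > > > > 	for (i = 0; i < nvqs; i++)
> > > > > 		vqs[i].irq = i < nvec ? my_vector_to_irq(i) : -EINVAL;
> > > > > }
> > > > > 
> > > > > /* intmode=share: round robin; every VQ gets a vector and vectors
> > > > >  * are shared once nvqs > nvec */
> > > > > static void assign_shared(struct my_vq *vqs, int nvqs, int nvec)
> > > > > {
> > > > > 	int i;
> > > > > 
> > > > > 	for (i = 0; i < nvqs; i++)
> > > > > 		vqs[i].irq = my_vector_to_irq(i % nvec);
> > > > > }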
> > > > I wonder:
> > > > 
> > > > 1) how the administrator can know if there are sufficient resources for
> > > > one of the above policies.
> > > There's no established way to know. The idea is to use whatever is
> > > available, assuming interrupt bypassing is always better than the callback
> > > mechanism.
> > > > 2) how does the administrator know which policy is best, assuming
> > > > the resources are sufficient? (e.g. vectors to RX only vs. vectors
> > > > to TX only)
> > > I don't think there's a rule of thumb here; the administrator needs to
> > > experiment to find what works best.
> > > > If it requires vendor-specific methods or knowledge, I believe it's
> > > > better to encode them in:
> > > > 
> > > > 1) the vDPA parent or
> > > > 2) the underlying management tool or drivers
> > > > 
> > > > Thanks
> > > I was also wondering about the current mechanism we have. The hardware
> > > driver reports an irq number for each VQ.
> > > 
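> > > For context, the consumer side in vhost-vdpa looks roughly like this
> > > (a condensed sketch modeled on the kernel's vq irq setup path; error
> > > handling and locking are omitted):
> > > 
> > > static void setup_vq_irq(struct vhost_vdpa *v, u16 qid)
> > > {
> > > 	struct vhost_virtqueue *vq = &v->vqs[qid];
> > > 	const struct vdpa_config_ops *ops = v->vdpa->config;
> > > 	int irq;
> > > 
> > > 	if (!ops->get_vq_irq)
> > > 		return;
> > > 
> > > 	irq = ops->get_vq_irq(v->vdpa, qid);
> > > 	if (irq < 0)
> > > 		return;	/* no vector: this VQ stays on the callback path */
> > > 
> > > 	/* register with the irq bypass manager so the hardware irq can
> > > 	 * be posted to the guest without bouncing through the host */
> > > 	vq->call_ctx.producer.token = vq->call_ctx.ctx;
> > > 	vq->call_ctx.producer.irq = irq;
> > > 	irq_bypass_register_producer(&vq->call_ctx.producer);
> > > }
> > > 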
> > > The guest driver sees a virtio pci device with as many MSIX vectors as
> > > there are virtqueues.
> > > 
> > > Suppose the hardware driver provided only 5 interrupt vectors while there
> > > are 16 VQs.
> > > 
> > > Which MSIX vector at the guest actually gets a posted interrupt, and
> > > which one uses a callback handled by the hardware driver?
> > Not sure I understand.
> > If you get a single interrupt from hardware, callback or posted,
> > you can only drive one interrupt to the guest, no?
> > 
> For every VQ I have a chance to assign an interrupt vector.
> 
> Consider this scenario:
> 
> mlx5_vdpa is created with 16 data virtqueues.
> 
> mlx5_vdpa associates VQ0 with an interrupt vector. The rest of the VQs
> don't get assigned vectors and use the old callback mechanism.
> 
> When you go to the VM and run lspci, you will see the device has 16 MSIX
> vectors.
> 
> Do you know which of the MSIX vectors on the guest is the vector I
> assigned to VQ0?

Me as in which component?
And I don't really understand how this answers the question.
If the hardware only supports 5 vectors, how can we expose 16
vectors to the guest? The host can send the guest as many as it
wants, sure (this is the callback you are referring to, right?),
but the host will not know which interrupt to send.
I conclude that exposing more vectors to the guest than the
hardware supports is simply not something we should do.

-- 
MST
