On Tue, Mar 28, 2023 at 12:05 PM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote:
>
> On Tue, Mar 28, 2023 at 11:44 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >
> > On Tue, Mar 28, 2023 at 11:33 AM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 28, 2023 at 11:14 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Mar 28, 2023 at 11:03 AM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Mar 24, 2023 at 2:28 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, Mar 23, 2023 at 1:31 PM Xie Yongji <xieyongji@xxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > To support interrupt affinity spreading mechanism,
> > > > > > > this makes use of group_cpus_evenly() to create
> > > > > > > an irq callback affinity mask for each virtqueue
> > > > > > > of vdpa device. Then we will unify set_vq_affinity
> > > > > > > callback to pass the affinity to the vdpa device driver.
> > > > > > >
> > > > > > > Signed-off-by: Xie Yongji <xieyongji@xxxxxxxxxxxxx>
> > > > > >
> > > > > > Thinking hard of all the logics, I think I've found something interesting.
> > > > > >
> > > > > > Commit ad71473d9c437 ("virtio_blk: use virtio IRQ affinity") tries to
> > > > > > pass irq_affinity to transport specific find_vqs(). This seems a
> > > > > > layer violation since the driver has no knowledge of
> > > > > >
> > > > > > 1) whether or not the callback is based on an IRQ
> > > > > > 2) whether or not the device is a PCI device (the details are hidden by
> > > > > > the transport driver)
> > > > > > 3) how many vectors could be used by a device
> > > > > >
> > > > > > This means the driver can't actually pass a real affinity mask, so the
> > > > > > commit in fact passes a zero irq affinity structure as a hint, so the
> > > > > > PCI layer can build a default affinity that groups cpus evenly
> > > > > > based on the number of MSI-X vectors (the core logic is the
> > > > > > group_cpus_evenly).
> > > > > > I think we should fix this by replacing the
> > > > > > irq_affinity structure with
> > > > > >
> > > > > > 1) a boolean like auto_cb_spreading
> > > > > >
> > > > > > or
> > > > > >
> > > > > > 2) queue to cpu mapping
> > > > > >
> > > > >
> > > > > But only the driver knows which queues are used in the control path
> > > > > which don't need the automatic irq affinity assignment.
> > > >
> > > > Is this knowledge awarded by the transport driver now?
> > > >
> > >
> > > This knowledge is awarded by the device driver rather than the transport driver.
> > >
> > > E.g. virtio-scsi uses:
> > >
> > > struct irq_affinity desc = { .pre_vectors = 2 }; // vq0 is control
> > > queue, vq1 is event queue
> >
> > Ok, but it only works as a hint, it's not a real affinity. As replied,
> > we can pass an array of boolean in this case then transport driver
> > knows it doesn't need to use automatic affinity for the first two
> > queues.
> >
>
> But we don't know whether we would use other fields in structure
> irq_affinity in the future. So a full set should be better?

Good point. So the issue is the calc_sets() and we probably need that
if there's a virtio driver that needs more than one set of vectors
to be spread.

Technically, we could have a virtio level abstraction for this but I
agree it's probably not worth bothering now.

> > > >
> > > > E.g. virtio-blk uses:
> > > >
> > > > struct irq_affinity desc = { 0, };
> > > >
> > > > At least we can tell the transport driver which vq requires automatic
> > > > irq affinity.
> > > >
> > >
> > > I think that is what the current implementation does.
> > >
> > > > > So I think the
> > > > > irq_affinity structure can only be created by device drivers and
> > > > > passed to the virtio-pci/virtio-vdpa driver.
> > > >
> > > > This could be not easy since the driver doesn't even know how many
> > > > interrupts will be used by the transport driver, so it can't build the
> > > > actual affinity structure.
> > > >
> > >
> > > The actual affinity mask is built by the transport driver,
> >
> > For PCI yes, it talks directly to the IRQ subsystems.
> >
> > > device
> > > driver only passes a hint on which queues don't need the automatic irq
> > > affinity assignment.
> >
> > But not for virtio-vDPA since the IRQ needs to be dealt with by the
> > parent driver. For our case, it's the VDUSE where it doesn't need IRQ
> > at all, a queue to cpu mapping is sufficient.
> >
>
> The device driver doesn't know whether it is bound to virtio-pci or
> virtio-vdpa. So it should pass a full set needed by the automatic irq
> affinity assignment instead of a subset. Then virtio-vdpa can choose
> to pass a queue to cpu mapping to VDUSE, which is what we do now (use
> set_vq_affinity()).

Yes, so basically two ways:

1) automatic IRQ management: passing affd to find_vqs(), the affinity is
   determined by the transport (e.g. vDPA).
2) affinity that is under the control of the driver: it needs to use
   set_vq_affinity() but needs to deal with cpu hotplug.

Thanks

>
> Thanks,
> Yongji
>

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
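
For illustration only, and not part of the patch under discussion: a
minimal user-space sketch of a "queue to cpu mapping" that honours a
pre_vectors-style hint, i.e. the first few queues (control/event) are
left out of the automatic spreading. The function name
spread_cpus_to_vqs() and the cpu_to_vq array are hypothetical. The
sketch assigns contiguous blocks of CPU indices to queues; it does not
model the kernel's group_cpus_evenly(), which works on cpumasks and
takes NUMA topology into account.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not kernel code.
 *
 * Map every CPU index in [0, ncpus) to one virtqueue index in
 * [pre_vectors, nvqs). Queues below pre_vectors (e.g. vq0/vq1 in the
 * virtio-scsi case quoted above) receive no CPUs from the spreading.
 * CPUs are split into contiguous, nearly equal blocks.
 */
static void spread_cpus_to_vqs(unsigned int ncpus, unsigned int nvqs,
                               unsigned int pre_vectors,
                               unsigned int *cpu_to_vq)
{
    unsigned int spread;
    unsigned int cpu;

    /* At least one queue must take part in the spreading. */
    assert(nvqs > pre_vectors);
    assert(cpu_to_vq != NULL);

    spread = nvqs - pre_vectors;
    for (cpu = 0; cpu < ncpus; cpu++)
        cpu_to_vq[cpu] = pre_vectors + (cpu * spread) / ncpus;
}
```

With 8 CPUs, 4 queues and pre_vectors = 2, CPUs 0-3 map to vq2 and
CPUs 4-7 map to vq3. With pre_vectors = 0 (the virtio-blk case) and as
many queues as CPUs, each CPU gets its own queue. A mapping like this
would still have to be recomputed on cpu hotplug, which is the cost of
option 2) above.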