On Fri, 15 Apr 2022 13:53:54 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> On Fri, Apr 15, 2022 at 10:23 AM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, 14 Apr 2022 17:30:02 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > On Wed, Apr 13, 2022 at 4:47 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, 13 Apr 2022 16:00:18 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > >
> > > > > On 2022/4/6 11:43 AM, Xuan Zhuo wrote:
> > > > > > This patch implements the resize function of the rx, tx queues.
> > > > > > Based on this function, it is possible to modify the ring num of the
> > > > > > queue.
> > > > > >
> > > > > > There may be an exception during the resize process, the resize may
> > > > > > fail, or the vq can no longer be used. Either way, we must execute
> > > > > > napi_enable(). Because napi_disable is similar to a lock, napi_enable
> > > > > > must be called after calling napi_disable.
> > > > > >
> > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
> > > > > > ---
> > > > > >  drivers/net/virtio_net.c | 81 ++++++++++++++++++++++++++++++++++++++++
> > > > > >  1 file changed, 81 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > index b8bf00525177..ba6859f305f7 100644
> > > > > > --- a/drivers/net/virtio_net.c
> > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > @@ -251,6 +251,9 @@ struct padded_vnet_hdr {
> > > > > >  	char padding[4];
> > > > > >  };
> > > > > >
> > > > > > +static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf);
> > > > > > +static void virtnet_rq_free_unused_buf(struct virtqueue *vq, void *buf);
> > > > > > +
> > > > > >  static bool is_xdp_frame(void *ptr)
> > > > > >  {
> > > > > >  	return (unsigned long)ptr & VIRTIO_XDP_FLAG;
> > > > > > @@ -1369,6 +1372,15 @@ static void virtnet_napi_enable(struct virtqueue *vq, struct napi_struct *napi)
> > > > > >  {
> > > > > >  	napi_enable(napi);
> > > > > >
> > > > > > +	/* Check if vq is in reset state. The normal reset/resize process will
> > > > > > +	 * be protected by napi. However, the protection of napi is only enabled
> > > > > > +	 * during the operation, and the protection of napi will end after the
> > > > > > +	 * operation is completed. If re-enable fails during the process, vq
> > > > > > +	 * will remain unavailable with reset state.
> > > > > > +	 */
> > > > > > +	if (vq->reset)
> > > > > > +		return;
> > > > >
> > > > >
> > > > > I don't get when we could hit this condition.
> > > >
> > > >
> > > > In patch 23, the code to implement re-enable vq is as follows:
> > > >
> > > > +static int vp_modern_enable_reset_vq(struct virtqueue *vq)
> > > > +{
> > > > +	struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
> > > > +	struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
> > > > +	struct virtio_pci_vq_info *info;
> > > > +	unsigned long flags, index;
> > > > +	int err;
> > > > +
> > > > +	if (!vq->reset)
> > > > +		return -EBUSY;
> > > > +
> > > > +	index = vq->index;
> > > > +	info = vp_dev->vqs[index];
> > > > +
> > > > +	/* check queue reset status */
> > > > +	if (vp_modern_get_queue_reset(mdev, index) != 1)
> > > > +		return -EBUSY;
> > > > +
> > > > +	err = vp_active_vq(vq, info->msix_vector);
> > > > +	if (err)
> > > > +		return err;
> > > > +
> > > > +	if (vq->callback) {
> > > > +		spin_lock_irqsave(&vp_dev->lock, flags);
> > > > +		list_add(&info->node, &vp_dev->virtqueues);
> > > > +		spin_unlock_irqrestore(&vp_dev->lock, flags);
> > > > +	} else {
> > > > +		INIT_LIST_HEAD(&info->node);
> > > > +	}
> > > > +
> > > > +	vp_modern_set_queue_enable(&vp_dev->mdev, index, true);
> > > > +
> > > > +	if (vp_dev->per_vq_vectors && info->msix_vector != VIRTIO_MSI_NO_VECTOR)
> > > > +		enable_irq(pci_irq_vector(vp_dev->pci_dev, info->msix_vector));
> > > > +
> > > > +	vq->reset = false;
> > > > +
> > > > +	return 0;
> > > > +}
> > > >
> > > >
> > > > There are three situations where an error will be returned. These are the
> > > > situations I want to handle.
> > >
> > > Right, but it looks harmless if we just schedule the NAPI without the check.
> >
> > Yes.
> >
> > > >
> > > > But I'm rethinking the question, and I feel like you're right, although the
> > > > hardware setup may fail. We can no longer sync with the hardware. But using it
> > > > as a normal vq doesn't have any problems.
> > >
> > > Note that we should make sure a buggy (malicious) device won't crash
> > > the code by changing the queue_reset value at will.
> >
> > I will keep an eye on this situation.
> >
> > >
> > > >
> > > > >
> > > > >
> > > > > > +
> > > > > >  	/* If all buffers were filled by other side before we napi_enabled, we
> > > > > >  	 * won't get another interrupt, so process any outstanding packets now.
> > > > > >  	 * Call local_bh_enable after to trigger softIRQ processing.
> > > > > > @@ -1413,6 +1425,15 @@ static void refill_work(struct work_struct *work)
> > > > > >  		struct receive_queue *rq = &vi->rq[i];
> > > > > >
> > > > > >  		napi_disable(&rq->napi);
> > > > > > +
> > > > > > +		/* Check if vq is in reset state. See more in
> > > > > > +		 * virtnet_napi_enable()
> > > > > > +		 */
> > > > > > +		if (rq->vq->reset) {
> > > > > > +			virtnet_napi_enable(rq->vq, &rq->napi);
> > > > > > +			continue;
> > > > > > +		}
> > > > >
> > > > >
> > > > > Can we do something similar in virtnet_close() by canceling the work?
> > > >
> > > > I think there is no need to cancel the work here, because napi_disable will wait
> > > > for the napi_enable of the resize. So if a vq whose re-enable failed is used as a
> > > > normal vq, this logic can be removed.
> > >
> > > Actually I meant the part of virtnet_rx_resize().
> > >
> > > If we don't synchronize with the refill work, it might enable NAPI unexpectedly?
> >
> > I don't think this situation will be encountered, because napi_disable is
> > mutually exclusive, so there will be no unexpected napi enable.
> >
> > Is there something I misunderstood?
>
> So in virtnet_rx_resize() we do:
>
> napi_disable()
> ...
> resize()
> ...
> napi_enable()
>
> How can we guarantee that the work is not run after the napi_disable()?

I think you're talking about a situation like this:

virtnet_rx_resize                 refill work
-----------------------------------------------------------
 napi_disable()
 ...                               napi_disable()
 resize()                          ...
                                   napi_enable()
 ...
 napi_enable()

But in fact:

virtnet_rx_resize                 refill work
-----------------------------------------------------------
 napi_disable()
 ...                               napi_disable() <----[0]
 resize()                            |
 ...                                 |
 napi_enable()                       |
                                   napi_disable() <----[1] succeeds here
                                   napi_enable()

Because virtnet_rx_resize() has already executed napi_disable(), the
napi_disable() at [0] blocks. It only completes at [1], after
virtnet_rx_resize() has called napi_enable().

I'm not sure if my understanding is correct.

Thanks.

>
> Thanks
>
> >
> > Thanks.
> >
> > >
> > > Thanks
> > >
> > > >
> > > > >
> > > > >
> > > > > > +
> > > > > >  		still_empty = !try_fill_recv(vi, rq, GFP_KERNEL);
> > > > > >  		virtnet_napi_enable(rq->vq, &rq->napi);
> > > > > >
> > > > > > @@ -1523,6 +1544,10 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> > > > > >  	if (!sq->napi.weight || is_xdp_raw_buffer_queue(vi, index))
> > > > > >  		return;
> > > > > >
> > > > > > +	/* Check if vq is in reset state. See more in virtnet_napi_enable() */
> > > > > > +	if (sq->vq->reset)
> > > > > > +		return;
> > > > >
> > > > >
> > > > > We've disabled TX napi, any chance we can still hit this?
> > > >
> > > > Same as above.
> > > >
> > > > >
> > > > >
> > > > > > +
> > > > > >  	if (__netif_tx_trylock(txq)) {
> > > > > >  		do {
> > > > > >  			virtqueue_disable_cb(sq->vq);
> > > > > > @@ -1769,6 +1794,62 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > > >  	return NETDEV_TX_OK;
> > > > > >  }
> > > > > >
> > > > > > +static int virtnet_rx_resize(struct virtnet_info *vi,
> > > > > > +			     struct receive_queue *rq, u32 ring_num)
> > > > > > +{
> > > > > > +	int err;
> > > > > > +
> > > > > > +	napi_disable(&rq->napi);
> > > > > > +
> > > > > > +	err = virtqueue_resize(rq->vq, ring_num, virtnet_rq_free_unused_buf);
> > > > > > +	if (err)
> > > > > > +		goto err;
> > > > > > +
> > > > > > +	if (!try_fill_recv(vi, rq, GFP_KERNEL))
> > > > > > +		schedule_delayed_work(&vi->refill, 0);
> > > > > > +
> > > > > > +	virtnet_napi_enable(rq->vq, &rq->napi);
> > > > > > +	return 0;
> > > > > > +
> > > > > > +err:
> > > > > > +	netdev_err(vi->dev,
> > > > > > +		   "reset rx reset vq fail: rx queue index: %td err: %d\n",
> > > > > > +		   rq - vi->rq, err);
> > > > > > +	virtnet_napi_enable(rq->vq, &rq->napi);
> > > > > > +	return err;
> > > > > > +}
> > > > > > +
> > > > > > +static int virtnet_tx_resize(struct virtnet_info *vi,
> > > > > > +			     struct send_queue *sq, u32 ring_num)
> > > > > > +{
> > > > > > +	struct netdev_queue *txq;
> > > > > > +	int err, qindex;
> > > > > > +
> > > > > > +	qindex = sq - vi->sq;
> > > > > > +
> > > > > > +	virtnet_napi_tx_disable(&sq->napi);
> > > > > > +
> > > > > > +	txq = netdev_get_tx_queue(vi->dev, qindex);
> > > > > > +	__netif_tx_lock_bh(txq);
> > > > > > +	netif_stop_subqueue(vi->dev, qindex);
> > > > > > +	__netif_tx_unlock_bh(txq);
> > > > > > +
> > > > > > +	err = virtqueue_resize(sq->vq, ring_num, virtnet_sq_free_unused_buf);
> > > > > > +	if (err)
> > > > > > +		goto err;
> > > > > > +
> > > > > > +	netif_start_subqueue(vi->dev, qindex);
> > > > > > +	virtnet_napi_tx_enable(vi, sq->vq, &sq->napi);
> > > > > > +	return 0;
> > > > > > +
> > > > > > +err:
> > > > >
> > > > >
> > > > > I guess we can still start the queue in this case? (Since we don't
> > > > > change the queue if resize fails).
> > > >
> > > > Yes, you are right.
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > >
> > > > > > +	netdev_err(vi->dev,
> > > > > > +		   "reset tx reset vq fail: tx queue index: %td err: %d\n",
> > > > > > +		   sq - vi->sq, err);
> > > > > > +	virtnet_napi_tx_enable(vi, sq->vq, &sq->napi);
> > > > > > +	return err;
> > > > > > +}
> > > > > > +
> > > > > >  /*
> > > > > >   * Send command via the control virtqueue and check status. Commands
> > > > > >   * supported by the hypervisor, as indicated by feature bits, should
> > > > >
> > > > >
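
The ordering argument in the reply above (the napi_disable() at [0] cannot return until virtnet_rx_resize() has called napi_enable()) can be tried out in user space. What follows is only a minimal sketch under stated assumptions, not kernel code: every toy_* name is hypothetical, and a pthread mutex plus condition variable stands in for the way napi_disable() waits, which in the kernel is done with state bits in struct napi_struct and a sleep loop as far as I understand it. The sketch only models the "napi_disable is similar to a lock" property that the patch description relies on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for a NAPI context. "disabled" models
 * the ownership that napi_disable() acquires and napi_enable() releases.
 */
struct toy_napi {
	pthread_mutex_t lock;
	pthread_cond_t released;
	bool disabled;
};

struct toy_rq {
	struct toy_napi napi;
	bool resizing;		/* true while the "ring" is being replaced */
	bool saw_resizing;	/* set if refill ran in the middle of a resize */
};

static void toy_napi_disable(struct toy_napi *n)
{
	pthread_mutex_lock(&n->lock);
	while (n->disabled)	/* already disabled by someone else: wait */
		pthread_cond_wait(&n->released, &n->lock);
	n->disabled = true;
	pthread_mutex_unlock(&n->lock);
}

static void toy_napi_enable(struct toy_napi *n)
{
	pthread_mutex_lock(&n->lock);
	n->disabled = false;
	pthread_cond_signal(&n->released);
	pthread_mutex_unlock(&n->lock);
}

/* Models refill_work(): disable, look at the queue, enable. */
static void *toy_refill_work(void *arg)
{
	struct toy_rq *rq = arg;

	toy_napi_disable(&rq->napi);	/* [0]: blocks while resize owns it */
	rq->saw_resizing = rq->resizing;
	toy_napi_enable(&rq->napi);
	return NULL;
}

/* Models virtnet_rx_resize() racing with the refill work.
 * Returns true if the refill work observed a half-done resize.
 */
static bool toy_resize_races_with_refill(void)
{
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
	struct toy_rq rq = { .resizing = false, .saw_resizing = false };
	pthread_t refill;
	int rc;

	pthread_mutex_init(&rq.napi.lock, NULL);
	pthread_cond_init(&rq.napi.released, NULL);
	rq.napi.disabled = false;

	toy_napi_disable(&rq.napi);
	rq.resizing = true;

	rc = pthread_create(&refill, NULL, toy_refill_work, &rq);
	assert(rc == 0);
	(void)rc;
	nanosleep(&delay, NULL);	/* give the refill work time to reach [0] */

	rq.resizing = false;
	toy_napi_enable(&rq.napi);	/* [1]: only now can [0] proceed */

	pthread_join(refill, NULL);
	pthread_cond_destroy(&rq.napi.released);
	pthread_mutex_destroy(&rq.napi.lock);
	return rq.saw_resizing;
}
```

Calling toy_resize_races_with_refill() from a main() (built with -pthread where the toolchain needs it) should return false: the refill thread either waits at [0] until the resize path re-enables, or starts late and finds the resize already finished. The sketch says nothing about whether the real refill work needs extra synchronization for other reasons; it only illustrates the mutual exclusion being discussed.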