On Sun, Apr 2, 2017 at 10:47 PM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> On Sun, Apr 02, 2017 at 04:10:12PM -0400, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb@xxxxxxxxxx>
>>
>> Amortize the cost of virtual interrupts by doing both rx and tx work
>> on reception of a receive interrupt if tx napi is enabled. With
>> VIRTIO_F_EVENT_IDX, this suppresses most explicit tx completion
>> interrupts for bidirectional workloads.
>>
>> Signed-off-by: Willem de Bruijn <willemb@xxxxxxxxxx>
>> ---
>>  drivers/net/virtio_net.c | 22 ++++++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 95d938e82080..af830eb212bf 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1030,12 +1030,34 @@ static int virtnet_receive(struct receive_queue *rq, int budget)
>>  	return received;
>>  }
>>
>> +static void free_old_xmit_skbs(struct send_queue *sq);
>> +
>
> Could you pls re-arrange code to avoid forward declarations?

Okay. I'll do the move in a separate patch to simplify review.

>> +static void virtnet_poll_cleantx(struct receive_queue *rq)
>> +{
>> +	struct virtnet_info *vi = rq->vq->vdev->priv;
>> +	unsigned int index = vq2rxq(rq->vq);
>> +	struct send_queue *sq = &vi->sq[index];
>> +	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);
>> +
>> +	if (!sq->napi.weight)
>> +		return;
>> +
>> +	__netif_tx_lock(txq, smp_processor_id());
>> +	free_old_xmit_skbs(sq);
>> +	__netif_tx_unlock(txq);
>> +
>> +	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
>> +		netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
>> +}
>> +
>
> Looks very similar to virtnet_poll_tx.
>
> I think this might be waking the tx queue too early, so
> it will tend to stay almost full for long periods of time.
> Why not defer wakeup until queue is at least half empty?

I'll test that. Delaying wake-up longer than necessary can cause queue
build-up at the qdisc and higher tail latency, I imagine. But it may
reduce the number of __netif_schedule calls.

> I wonder whether it's worth it to handle very short queues
> correctly - they previously made very slow progress,
> now they are never woken up.
>
> I'm a bit concerned about the cost of these wakeups
> and locking. I note that this wake is called basically
> every time the queue is not full.
>
> Would it make sense to limit the amount of tx polling?
> Maybe use trylock to reduce the conflict with xmit?

Yes, that sounds good. I did test that previously and saw no difference
then. But when multiple cpus contend for a single txq it should help.
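
Concretely, a rough and untested sketch of what the trylock plus
deferred-wake variant could look like. It uses virtqueue_get_vring_size()
to derive the ring size, and the half-empty threshold is only a
placeholder to experiment with, not a tuned value:

static void virtnet_poll_cleantx(struct receive_queue *rq)
{
	struct virtnet_info *vi = rq->vq->vdev->priv;
	unsigned int index = vq2rxq(rq->vq);
	struct send_queue *sq = &vi->sq[index];
	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);

	if (!sq->napi.weight)
		return;

	/* give up rather than spin if the xmit path holds the lock;
	 * completed skbs will be cleaned on a later pass
	 */
	if (!__netif_tx_trylock(txq))
		return;

	free_old_xmit_skbs(sq);
	__netif_tx_unlock(txq);

	/* wake only once the ring has drained to half capacity,
	 * instead of on every call while it hovers near full
	 */
	if (sq->vq->num_free >= virtqueue_get_vring_size(sq->vq) / 2)
		netif_wake_subqueue(vi->dev, vq2txq(sq->vq));
}

For very small rings the half-ring threshold would probably need to be
combined with the existing 2 + MAX_SKB_FRAGS floor, otherwise the queue
could be woken while there is still no room for a worst-case skb.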