Re: [PATCH net] virtio-net: suppress bad irq warning for tx napi

On Sun, Feb 7, 2021 at 10:29 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
>
> On 2021/2/5 4:50 AM, Willem de Bruijn wrote:
> > On Wed, Feb 3, 2021 at 10:06 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>
> >> On 2021/2/4 2:28 AM, Willem de Bruijn wrote:
> >>> On Wed, Feb 3, 2021 at 12:33 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>> On 2021/2/2 10:37 PM, Willem de Bruijn wrote:
> >>>>> On Mon, Feb 1, 2021 at 10:09 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>>> On 2021/1/29 8:21 AM, Wei Wang wrote:
> >>>>>>> With the implementation of napi-tx in virtio driver, we clean tx
> >>>>>>> descriptors from rx napi handler, for the purpose of reducing tx
> >>>>>>> complete interrupts. But this could introduce a race where tx complete
> >>>>>>> interrupt has been raised, but the handler found there is no work to do
> >>>>>>> because we have done the work in the previous rx interrupt handler.
> >>>>>>> This could lead to the following warning msg:
> >>>>>>> [ 3588.010778] irq 38: nobody cared (try booting with the
> >>>>>>> "irqpoll" option)
> >>>>>>> [ 3588.017938] CPU: 4 PID: 0 Comm: swapper/4 Not tainted
> >>>>>>> 5.3.0-19-generic #20~18.04.2-Ubuntu
> >>>>>>> [ 3588.017940] Call Trace:
> >>>>>>> [ 3588.017942]  <IRQ>
> >>>>>>> [ 3588.017951]  dump_stack+0x63/0x85
> >>>>>>> [ 3588.017953]  __report_bad_irq+0x35/0xc0
> >>>>>>> [ 3588.017955]  note_interrupt+0x24b/0x2a0
> >>>>>>> [ 3588.017956]  handle_irq_event_percpu+0x54/0x80
> >>>>>>> [ 3588.017957]  handle_irq_event+0x3b/0x60
> >>>>>>> [ 3588.017958]  handle_edge_irq+0x83/0x1a0
> >>>>>>> [ 3588.017961]  handle_irq+0x20/0x30
> >>>>>>> [ 3588.017964]  do_IRQ+0x50/0xe0
> >>>>>>> [ 3588.017966]  common_interrupt+0xf/0xf
> >>>>>>> [ 3588.017966]  </IRQ>
> >>>>>>> [ 3588.017989] handlers:
> >>>>>>> [ 3588.020374] [<000000001b9f1da8>] vring_interrupt
> >>>>>>> [ 3588.025099] Disabling IRQ #38
> >>>>>>>
> >>>>>>> This patch adds a new param to struct vring_virtqueue, and we set it for
> >>>>>>> tx virtqueues if napi-tx is enabled, to suppress the warning in such
> >>>>>>> case.
> >>>>>>>
> >>>>>>> Fixes: 7b0411ef4aa6 ("virtio-net: clean tx descriptors from rx napi")
> >>>>>>> Reported-by: Rick Jones <jonesrick@xxxxxxxxxx>
> >>>>>>> Signed-off-by: Wei Wang <weiwan@xxxxxxxxxx>
> >>>>>>> Signed-off-by: Willem de Bruijn <willemb@xxxxxxxxxx>
> >>>>>> Please use get_maintainer.pl to make sure Michael and me were cced.
> >>>>> Will do. Sorry about that. I suggested just the virtualization list, my bad.
> >>>>>
> >>>>>>> ---
> >>>>>>>      drivers/net/virtio_net.c     | 19 ++++++++++++++-----
> >>>>>>>      drivers/virtio/virtio_ring.c | 16 ++++++++++++++++
> >>>>>>>      include/linux/virtio.h       |  2 ++
> >>>>>>>      3 files changed, 32 insertions(+), 5 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>>>>> index 508408fbe78f..e9a3f30864e8 100644
> >>>>>>> --- a/drivers/net/virtio_net.c
> >>>>>>> +++ b/drivers/net/virtio_net.c
> >>>>>>> @@ -1303,13 +1303,22 @@ static void virtnet_napi_tx_enable(struct virtnet_info *vi,
> >>>>>>>                  return;
> >>>>>>>          }
> >>>>>>>
> >>>>>>> +     /* With napi_tx enabled, free_old_xmit_skbs() could be called from
> >>>>>>> +      * rx napi handler. Set work_steal to suppress bad irq warning for
> >>>>>>> +      * IRQ_NONE case from tx complete interrupt handler.
> >>>>>>> +      */
> >>>>>>> +     virtqueue_set_work_steal(vq, true);
> >>>>>>> +
> >>>>>>>          return virtnet_napi_enable(vq, napi);
> >>>>>> Do we need to force the ordering between steal set and napi enable?
> >>>>> The warning only occurs after one hundred spurious interrupts, so not
> >>>>> really.
> >>>> Ok, so it looks like a hint. Then I wonder how much value there is in
> >>>> introducing a helper like virtqueue_set_work_steal() that allows the
> >>>> caller to toggle it. How about disabling the check permanently during
> >>>> virtqueue initialization?
> >>> Yes, that is even simpler.
> >>>
> >>> We still need the helper, as the internal variables of vring_virtqueue
> >>> are not accessible from virtio-net. An earlier patch added the
> >>> variable to virtqueue itself, but I think it belongs in
> >>> vring_virtqueue. And the helper is not a lot of code.
> >>
> >> It's better to do this before allocating the irq. But that does not look
> >> easy unless we extend find_vqs().
> > Can you elaborate why that is better? At virtnet_open the interrupts
> > are not firing either.
>
>
> I think you meant NAPI actually?

I meant interrupt: we don't have to worry about the spurious interrupt
warning when no interrupts will be firing. Until virtnet_open
completes, the device is down.


>
> >
> > I have no preference. Just curious, especially if it complicates the patch.
> >
>
> My understanding is that it's probably ok for net. But we probably need
> to document the assumptions to make sure it is not abused in other drivers.
>
> Introducing new parameters for find_vqs() can help to eliminate the subtle
> stuff, but I agree it looks like overkill.
>
> (Btw, I forget the numbers, but I wonder how much difference it makes if we
> simply remove the free_old_xmits() from the rx NAPI path?)

The committed patchset did not record those numbers, but I found them
in an earlier iteration:

  [PATCH net-next 0/3] virtio-net tx napi
  https://lists.openwall.net/netdev/2017/04/02/55

It did seem to significantly reduce compute cycles ("Gcyc") at the
time. For instance:

    TCP_RR Latency (us):
    1x:
      p50              24       24       21
      p99              27       27       27
      Gcycles         299      432      308

I'm concerned that removing it now may cause a regression report in a
few months. That is a higher risk than the spurious interrupt warning,
which was only reported after years of use.
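
For readers following the thread, the race described in the patch
description can be modelled in a few lines of user-space C. This is only a
sketch, not kernel code: every sim_* name is hypothetical, and it assumes
that the work_steal flag makes the tx handler report the interrupt as
handled when the rx path has already cleaned the queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum sim_irqreturn { SIM_IRQ_NONE = 0, SIM_IRQ_HANDLED = 1 };

struct sim_txq {
	int pending;      /* completed tx descriptors not yet cleaned */
	bool work_steal;  /* assumption: another context may clean this queue */
};

/* Models the rx napi handler cleaning tx descriptors; returns the count. */
static int sim_rx_poll_clean_tx(struct sim_txq *q)
{
	int cleaned = q->pending;

	q->pending = 0;
	return cleaned;
}

/* Models the tx completion interrupt handler. */
static enum sim_irqreturn sim_tx_interrupt(struct sim_txq *q)
{
	if (q->pending == 0) {
		/* No work left: spurious unless work stealing is expected. */
		return q->work_steal ? SIM_IRQ_HANDLED : SIM_IRQ_NONE;
	}
	q->pending = 0;
	return SIM_IRQ_HANDLED;
}

/* Counts unhandled tx interrupts when the rx path always wins the race. */
static int sim_count_unhandled(bool work_steal, int rounds)
{
	struct sim_txq q = { .pending = 0, .work_steal = work_steal };
	int unhandled = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		q.pending = 1;             /* device completes one tx descriptor */
		sim_rx_poll_clean_tx(&q);  /* rx napi runs first and cleans it */
		if (sim_tx_interrupt(&q) == SIM_IRQ_NONE)
			unhandled++;
	}
	return unhandled;
}
```

In this model, every tx interrupt goes unhandled when work_steal is false
and none do when it is true. The real kernel reports "nobody cared" only
once unhandled interrupts dominate over a long run of interrupts.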