Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 23, 2020 at 8:21 AM wangyunjian <wangyunjian@xxxxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Jason Wang [mailto:jasowang@xxxxxxxxxx]
> > Sent: Wednesday, December 23, 2020 10:54 AM
> > To: Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx>
> > Cc: wangyunjian <wangyunjian@xxxxxxxxxx>; Network Development
> > <netdev@xxxxxxxxxxxxxxx>; Michael S. Tsirkin <mst@xxxxxxxxxx>;
> > virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx; Lilijun (Jerry)
> > <jerry.lilijun@xxxxxxxxxx>; chenchanghu <chenchanghu@xxxxxxxxxx>;
> > xudingke <xudingke@xxxxxxxxxx>; huangbin (J)
> > <brian.huangbin@xxxxxxxxxx>
> > Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
> >
> >
> > On 2020/12/22 下午10:24, Willem de Bruijn wrote:
> > > On Mon, Dec 21, 2020 at 11:41 PM Jason Wang <jasowang@xxxxxxxxxx>
> > wrote:
> > >>
> > >> On 2020/12/22 上午7:07, Willem de Bruijn wrote:
> > >>> On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@xxxxxxxxxx>
> > wrote:
> > >>>> From: Yunjian Wang<wangyunjian@xxxxxxxxxx>
> > >>>>
> > >>>> Currently we break the loop and wake up the vhost_worker when
> > >>>> sendmsg fails. When the worker wakes up again, we'll meet the same
> > >>>> error.
> > >>> The patch is based on the assumption that such error cases always
> > >>> return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> > >>>
> > >>>> This will cause high CPU load. To fix this issue, we can skip this
> > >>>> description by ignoring the error. When we exceeds sndbuf, the
> > >>>> return value of sendmsg is -EAGAIN. In the case we don't skip the
> > >>>> description and don't drop packet.
> > >>> the -> that
> > >>>
> > >>> here and above: description -> descriptor
> > >>>
> > >>> Perhaps slightly revise to more explicitly state that
> > >>>
> > >>> 1. in the case of persistent failure (i.e., bad packet), the driver
> > >>> drops the packet 2. in the case of transient failure (e.g,. memory
> > >>> pressure) the driver schedules the worker to try again later
> > >>
> > >> If we want to go with this way, we need a better time to wakeup the
> > >> worker. Otherwise it just produces more stress on the cpu that is
> > >> what this patch tries to avoid.
> > > Perhaps I misunderstood the purpose of the patch: is it to drop
> > > everything, regardless of transient or persistent failure, until the
> > > ring runs out of descriptors?
> >
> >
> > My understanding is that the main motivation is to avoid high cpu utilization
> > when sendmsg() fail due to guest reason (e.g bad packet).
> >
>
> My main motivation is to avoid the tx queue stuck.
>
> Should I describe it like this:
> Currently the driver don't drop a packet which can't be send by tun
> (e.g bad packet). In this case, the driver will always process the
> same packet lead to the tx queue stuck.
>
> To fix this issue:
> 1. in the case of persistent failure (e.g bad packet), the driver can skip
> this descriptior by ignoring the error.
> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the driver
> schedules the worker to try again.

That sounds good to me, thanks.

> Thanks
>
> >
> > >
> > > I can understand both a blocking and drop strategy during memory
> > > pressure. But partial drop strategy until exceeding ring capacity
> > > seems like a peculiar hybrid?
> >
> >
> > Yes. So I wonder if we want to be do better when we are in the memory
> > pressure. E.g can we let socket wake up us instead of rescheduling the
> > workers here? At least in this case we know some memory might be freed?

I don't know whether a blocking or drop strategy is the better choice.
Either way, it probably deserves to be handled separately.
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux