Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Aug 20, 2017 at 4:49 PM, Willem de Bruijn
<willemdebruijn.kernel@xxxxxxxxx> wrote:
> On Sat, Aug 19, 2017 at 2:38 AM, Koichiro Den <den@xxxxxxxxxxxxx> wrote:
>> Facing the possible unbounded delay relying on freeing on xmit path,
>> we also better to invoke and clear the upper layer zerocopy callback
>> beforehand to keep them from waiting for unbounded duration in vain.
>
> Good point.
>
>> For instance, this removes the possible deadlock in the case that the
>> upper layer is a zerocopy-enabled vhost-net.
>> This does not apply if napi_tx is enabled since it will be called in
>> reasonale time.
>
> Indeed. Btw, I am gathering data to eventually make napi the default
> mode. But that is taking some time.
>
>>
>> Signed-off-by: Koichiro Den <den@xxxxxxxxxxxxx>
>> ---
>>  drivers/net/virtio_net.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 4302f313d9a7..f7deaa5b7b50 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1290,6 +1290,14 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>>
>>         /* Don't wait up for transmitted skbs to be freed. */
>>         if (!use_napi) {
>> +               if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
>> +                       struct ubuf_info *uarg;
>> +                       uarg = skb_shinfo(skb)->destructor_arg;
>> +                       if (uarg->callback)
>> +                           uarg->callback(uarg, true);
>> +                       skb_shinfo(skb)->destructor_arg = NULL;
>> +                       skb_shinfo(skb)->tx_flags &= ~SKBTX_DEV_ZEROCOPY;
>> +               }
>
> Instead of open coding, this can use skb_zcopy_clear.

It is not correct to only send the zerocopy completion here, as
the skb will still be shared with the nic. The only safe approach
to notifying early is to create a copy with skb_orphan_frags_rx.
That will call skb_zcopy_clear(skb, false) internally. But the
copy will be very expensive.

Is the deadlock you refer to the case where tx interrupts are
disabled, so transmit completions are only handled in start_xmit
and somehow the socket(s) are unable to send new data? The
question is what is blocking them. If it is zerocopy, it may be a
too low optmem_max or hitting the per-user locked pages limit.
We may have to keep interrupts enabled when zerocopy skbs
are in flight.
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization



[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux