Re: [RFC] virtio_net: add local_bh_disable() around u64_stats_update_begin

On 2018/10/17 9:13 AM, Toshiaki Makita wrote:
On 2018/10/17 1:55, Sebastian Andrzej Siewior wrote:
On 32-bit, lockdep notices:
| ================================
| WARNING: inconsistent lock state
| 4.19.0-rc8+ #9 Tainted: G        W
| --------------------------------
| inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
| ip/1106 [HC0[0]:SC1[1]:HE1:SE0] takes:
| (ptrval) (&syncp->seq#2){+.?.}, at: net_rx_action+0xc8/0x380
| {SOFTIRQ-ON-W} state was registered at:
|   lock_acquire+0x7e/0x170
|   try_fill_recv+0x5fa/0x700
|   virtnet_open+0xe0/0x180
|   __dev_open+0xae/0x130
|   __dev_change_flags+0x17f/0x200
|   dev_change_flags+0x23/0x60
|   do_setlink+0x2bb/0xa20
|   rtnl_newlink+0x523/0x830
|   rtnetlink_rcv_msg+0x14b/0x470
|   netlink_rcv_skb+0x6e/0xf0
|   rtnetlink_rcv+0xd/0x10
|   netlink_unicast+0x16e/0x1f0
|   netlink_sendmsg+0x1af/0x3a0
|   ___sys_sendmsg+0x20f/0x240
|   __sys_sendmsg+0x39/0x80
|   sys_socketcall+0x13a/0x2a0
|   do_int80_syscall_32+0x50/0x180
|   restore_all+0x0/0xb2
| irq event stamp: 3326
| hardirqs last  enabled at (3326): [<c159e6d0>] net_rx_action+0x80/0x380
| hardirqs last disabled at (3325): [<c159e6aa>] net_rx_action+0x5a/0x380
| softirqs last  enabled at (3322): [<c14b440d>] virtnet_napi_enable+0xd/0x60
| softirqs last disabled at (3323): [<c101d63d>] call_on_stack+0xd/0x50
|
| other info that might help us debug this:
|  Possible unsafe locking scenario:
|
|        CPU0
|        ----
|   lock(&syncp->seq#2);
|   <Interrupt>
|     lock(&syncp->seq#2);
|
|  *** DEADLOCK ***
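
For readers unfamiliar with why lockdep treats this as a lock: on 32-bit SMP kernels the u64_stats sync point is backed by a sequence counter, and lockdep tracks the write side like a lock. Below is a minimal userspace sketch of the hazard, not kernel code. Every model_* name is hypothetical and invented for illustration, and the "softirq" is simulated by calling the nested write section inline.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of a sequence counter; NOT the kernel API. */
struct model_seqcount {
	unsigned int seq;	/* odd while a writer is inside its section */
};

struct model_stats {
	struct model_seqcount syncp;
	uint32_t kicks_lo;	/* 64-bit counter kept as two 32-bit halves */
	uint32_t kicks_hi;
};

static void model_write_begin(struct model_seqcount *s) { s->seq++; }
static void model_write_end(struct model_seqcount *s)   { s->seq++; }

/* A reader only trusts a snapshot taken while seq is even. */
static int model_reader_would_accept(const struct model_seqcount *s)
{
	return (s->seq & 1u) == 0;
}

/*
 * Writer in process context is interrupted mid-update by a second writer
 * on the same CPU. Returns 1 if a reader would accept the half-updated data.
 */
static int model_nested_writer_hazard(void)
{
	struct model_stats st = { {0}, 0xffffffffu, 0 };
	int accepted_mid_update;

	model_write_begin(&st.syncp);	/* seq 0 -> 1 */
	st.kicks_lo = 0;		/* low half wrapped, carry not yet done */

	/* simulated softirq starts its own write section here */
	model_write_begin(&st.syncp);	/* seq 1 -> 2: even, update still in flight */
	accepted_mid_update = model_reader_would_accept(&st.syncp);
	model_write_end(&st.syncp);	/* seq 2 -> 3 */

	st.kicks_hi = 1;		/* outer writer finishes the carry */
	model_write_end(&st.syncp);	/* seq 3 -> 4 */
	return accepted_mid_update;
}

/* Same midpoint check when no nested writer can run (the BH-disabled case). */
static int model_protected_writer_midpoint(void)
{
	struct model_stats st = { {0}, 0xffffffffu, 0 };
	int accepted_mid_update;

	model_write_begin(&st.syncp);	/* seq 0 -> 1 */
	st.kicks_lo = 0;
	accepted_mid_update = model_reader_would_accept(&st.syncp);
	st.kicks_hi = 1;
	model_write_end(&st.syncp);	/* seq 1 -> 2 */
	return accepted_mid_update;
}
```

In this model the nested write section flips the counter back to an even value while the outer update is half done, so a reader could accept a torn 64-bit value. With no nested writer the counter stays odd for the whole update and the reader retries.
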
IIUC, try_fill_recv() is called from process context only while NAPI is
disabled, so it should have no opportunity to race with virtnet_receive(),
which is called from the NAPI handler.

I'm not sure what condition triggered this warning.


Toshiaki Makita


Or maybe NAPI is enabled unexpectedly somewhere?

Btw, the schedule_delayed_work() in virtnet_open() is also suspicious: if the work is executed before virtnet_napi_enable(), there will be a dead loop in napi_disable().

Thanks




This is the "up" path, which is not a hot path. There is also
refill_work().
It might be unwise to add local_bh_disable() to try_fill_recv():
if it is called mostly from BH context, the local_bh_disable() and
local_bh_enable() pair might be a waste of cycles.

Adding local_bh_disable() around try_fill_recv() for the non-BH call
sites would render GFP_KERNEL pointless, since the allocation could no
longer sleep with BH disabled.

Also, ptr->var++ is not an atomic operation even on 64-bit CPUs, which
means the increment can race if try_fill_recv() runs on CPU0 (via
virtnet_receive()) while the worker runs on CPU1.
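
To make the non-atomic increment concrete, here is a small deterministic sketch that splits counter++ into its load, add, and store steps and interleaves two simulated CPUs by hand. It is a userspace illustration under that assumption only: the model_* names are hypothetical, and no real threads or kernel APIs are involved.

```c
#include <assert.h>
#include <stdint.h>

/* One simulated CPU: counter++ happens via a private register copy. */
struct model_cpu {
	uint64_t reg;
};

static void model_load(struct model_cpu *c, const uint64_t *ctr) { c->reg = *ctr; }
static void model_add(struct model_cpu *c)                       { c->reg += 1; }
static void model_store(const struct model_cpu *c, uint64_t *ctr) { *ctr = c->reg; }

/* Both CPUs load the old value before either one stores its result. */
static uint64_t model_interleaved_increments(void)
{
	uint64_t kicks = 0;
	struct model_cpu cpu0 = { 0 }, cpu1 = { 0 };

	model_load(&cpu0, &kicks);
	model_load(&cpu1, &kicks);
	model_add(&cpu0);
	model_add(&cpu1);
	model_store(&cpu0, &kicks);
	model_store(&cpu1, &kicks);	/* overwrites cpu0's update */
	return kicks;
}

/* Each CPU completes its load/add/store before the other starts. */
static uint64_t model_serialized_increments(void)
{
	uint64_t kicks = 0;
	struct model_cpu cpu0 = { 0 }, cpu1 = { 0 };

	model_load(&cpu0, &kicks);
	model_add(&cpu0);
	model_store(&cpu0, &kicks);
	model_load(&cpu1, &kicks);
	model_add(&cpu1);
	model_store(&cpu1, &kicks);
	return kicks;
}
```

The interleaved schedule ends with one increment lost, while the serialized schedule keeps both. Whether a lost kick count matters for a statistics counter is exactly the question raised here.
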

Do we care or is this just stupid stats?  Any suggestions?

This warning has appeared since commit 461f03dc99cf6 ("virtio_net: Add kick stats").

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
---
  drivers/net/virtio_net.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index dab504ec5e502..d782160cfa882 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1206,9 +1206,11 @@ static bool try_fill_recv(struct virtnet_info *vi, struct receive_queue *rq,
  			break;
  	} while (rq->vq->num_free);
  	if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) {
+		local_bh_disable();
  		u64_stats_update_begin(&rq->stats.syncp);
  		rq->stats.kicks++;
  		u64_stats_update_end(&rq->stats.syncp);
+		local_bh_enable();
  	}
  	return !oom;

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization



