Re: [PATCH] Kernel OOPS in xen_netbk_rx_action / xenvif_gop_skb

Hello Wei Liu,

On 10.07.2014 14:41, Wei Liu wrote:
> On Wed, Jul 02, 2014 at 09:45:44AM +0200, Philipp Hahn wrote:
>> @Wei Liu: You said that the patch is only a quick hack to check whether
>> my analysis is correct, and that a proper fix would be needed. For us the
>> attached patch works, as the problem does not happen that often and is
>> hard to reproduce anyway, so spending more time on that issue is
>> probably not worth it. And that flag doesn't look that ugly.
...
> I agree that we would like to avoid spending too much time on this
> issue.

This is also what I'm thinking.

> Since the problem is confirmed, I think a proper fix will be to
> reference count vif and prevent it from unmapping the ring before all
> queued SKBs are consumed.

vif is already ref-counted; see attached *untested* patch for a start.

What I don't like is xenbus_unmap_ring_vfree() potentially printing an
error on double-free when called from xen_netbk_unmap_frontend_rings().
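
To illustrate the double-unmap concern, here is a minimal user-space
sketch, not kernel code. All names (fake_vif, fake_unmap_ring,
fake_unmap_frontend_rings, unmap_calls) are hypothetical stand-ins, and
free() only models what xenbus_unmap_ring_vfree() would release. It
assumes that clearing the ring pointer after the unmap is acceptable,
which the attached patch does not do:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xenvif; not the kernel structure. */
struct fake_vif {
	atomic_int refcnt;	/* references held by queued packets */
	void *tx_sring;		/* models vif->tx.sring */
	void *rx_sring;		/* models vif->rx.sring */
	int unmap_calls;	/* counts real unmap operations */
};

/* Models xenbus_unmap_ring_vfree(): releases one ring mapping. */
static void fake_unmap_ring(struct fake_vif *vif, void **ring)
{
	free(*ring);
	*ring = NULL;		/* a repeated call now finds nothing to unmap */
	vif->unmap_calls++;
}

/* Returns 1 if the unmap path ran, 0 if it was skipped. */
static int fake_unmap_frontend_rings(struct fake_vif *vif)
{
	/* Queued packets still reference the vif: leave the rings mapped. */
	if (atomic_load(&vif->refcnt) != 0)
		return 0;

	if (vif->tx_sring)
		fake_unmap_ring(vif, &vif->tx_sring);
	if (vif->rx_sring)
		fake_unmap_ring(vif, &vif->rx_sring);
	return 1;
}
```

With the pointers cleared, a first call made while packets are still
queued is skipped, and a later second call after the rings are already
gone does no work and would not reach the error path.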

> But it might require much more work than that quick hack.

I'm no network driver/Xen expert, but that sounds like more work for no
gain: we would still copy packets for a guest that is already dead. If
the ring gets full, we would probably need to go to sleep and wait for an
answer that will never come.

> FWIW this bug doesn't exist in kernel >=3.12.

That is one more point in favor of the hack, since the problem is
properly fixed there and is very hard to trigger.

> Would you be up for writing a patch? I won't be able to write
> one in the near future.
> Furthermore, you're the only party who can verify a fix right now.

I had a look and created the attached patch, which is untested, as I
currently can't access the faulting system and have been unable to
reproduce it in my development environment.

The quick hack has been running on that system for several weeks
without problems.

Sincerely
Philipp
From b2cb74f337e4af1bd1081249db6825ac3208d09f Mon Sep 17 00:00:00 2001
Message-Id: <b2cb74f337e4af1bd1081249db6825ac3208d09f.1405071570.git.hahn@xxxxxxxxxxxxx>
From: Wei Liu <wei.liu2@xxxxxxxxxx>
Date: Wed, 2 Jul 2014 09:14:22 +0200
Subject: [PATCHv2] xen-netback: unmap only empty shared ring
Organization: Univention GmbH, Bremen, Germany

1. The VM is receiving packets through bonding + bridge + netback +
netfront.

2. For some unknown reason at least one packet remains in the rx queue
and is not delivered to the domU immediately by netback.

3. The VM finishes shutting down.

4. The shared ring between dom0 and domU is freed.

5. xen-netback then continues processing the pending requests and tries
to put the packet into the already released shared ring.

> [38551.547728] XXXlan0: port 9(vif26.0) entered disabled state
> [38551.549365] BUG: unable to handle kernel paging request at ffffc900108641d8
> [38551.549461] IP: [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0
> [xen_netback]
> [38551.549551] PGD 57e20067 PUD 57e21067 PMD 571a7067 PTE 0
> [38551.549615] Oops: 0000 [#1] SMP
> [38551.549665] Modules linked in: tun xt_physdev xen_blkback xen_netback ip6_tables
> iptable_filter ip_tables ebtable_nat ebtables x_tables xen_gntdev nfsv3 nfsv4
> rpcsec_gss_krb5 nfsd nfs_acl auth_rpcgss oid_registry nfs fscache dns_resolver lockd
> sunrpc fuse loop xen_blkfront xen_evtchn blktap quota_v2 quota_tree xenfs xen_privcmd
> coretemp crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul
> glue_helper aes_x86_64 snd_pcm snd_timer snd soundcore snd_page_alloc tpm_tis tpm lpc_ich
> tpm_bios i7core_edac i2c_i801 psmouse microcode edac_core serio_raw pcspkr mperf ioatdma
> mfd_core processor evdev thermal_sys ext4 jbd2 crc16 bonding bridge stp llc dm_snapshot
> dm_mirror dm_region_hash dm_log dm_mod sd_mod crc_t10dif ehci_pci uhci_hcd ehci_hcd mptsas
> mptscsih mptbase scsi_transport_sas usbcore usb_common igb dca i2c_algo_bit i2c_core ptp
> pps_core button
> [38551.550601] CPU: 0 PID: 12587 Comm: netback/0 Not tainted 3.10.0-ucs58-amd64 #1 Debian
> 3.10.11-1.58.201405060908
> [38551.550693] Hardware name: FUJITSU PRIMERGY BX620 S6/D3051, BIOS 080015 Rev.3C78.3051
> 07/22/2011
> [38551.550781] task: ffff880004b067c0 ti: ffff8800561ec000 task.ti: ffff8800561ec000
> [38551.550865] RIP: e030:[<ffffffffa04147dc>]  [<ffffffffa04147dc>]
> xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
> [38551.550959] RSP: e02b:ffff8800561edce8  EFLAGS: 00010202
> [38551.551009] RAX: ffffc900104adac0 RBX: ffff8800541e95c0 RCX: ffffc90010864000
> [38551.551064] RDX: 000000000000003b RSI: 0000000000000000 RDI: ffff880040014380
> [38551.551120] RBP: ffff8800570e6800 R08: 0000000000000000 R09: ffff880004799800
> [38551.551175] R10: ffffffff813ca115 R11: ffff88005e4fdb08 R12: ffff880054e6f800
> [38551.551231] R13: ffff8800561edd58 R14: ffffc900104a1000 R15: 0000000000000000
> [38551.551289] FS:  00007f19a54a8700(0000) GS:ffff88005da00000(0000)
> knlGS:0000000000000000
> [38551.551374] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [38551.551425] CR2: ffffc900108641d8 CR3: 0000000054cb3000 CR4: 0000000000002660
> [38551.551481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [38551.551537] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [38551.551592] Stack:
> [38551.551630]  ffff880004b06ba0 0000000000000000 ffff88005da13ec0 ffff88005da13ec0
> [38551.551726]  0000000004b067c0 ffffc900104a8ac0 ffffc900104a1020 000000005da13ec0
> [38551.551823]  0000000000000000 0000000000000001 ffffc900104a8ac0 ffffc900104adac0
> [38551.551920] Call Trace:
> [38551.551966]  [<ffffffff813ca32d>] ? _raw_spin_lock_irqsave+0x11/0x2f
> [38551.552021]  [<ffffffffa0416033>] ? xen_netbk_kthread+0x174/0x841 [xen_netback]
> [38551.552106]  [<ffffffff8105d373>] ? wake_up_bit+0x20/0x20
> [38551.560239]  [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
> [38551.560325]  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
> [38551.560381]  [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback]
> [38551.560466]  [<ffffffff8105ce1e>] ? kthread+0xab/0xb3
> [38551.560518]  [<ffffffff81003638>] ? xen_end_context_switch+0xe/0x1c
> [38551.560572]  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
> [38551.560628]  [<ffffffff813cfbfc>] ? ret_from_fork+0x7c/0xb0
> [38551.560680]  [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56
> [38551.560734] Code: 8b b3 d0 00 00 00 48 8b bb d8 00 00 00 0f b7 74 37 02 89 70 08 eb 07
> c7 40 08 00 00 00 00 89 d2 c7 40 04 00 00 00 00 48 83 c2 08 <0f> b7 34 d1 89 30 c7 44 24
> 60 00 00 00 00 8b 44 d1 04 89 44 24
> [38551.561151] RIP  [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback]
> [38551.561238]  RSP <ffff8800561edce8>
> [38551.561283] CR2: ffffc900108641d8
> [38551.561624] ---[ end trace 8c260c6af259c4aa ]---

Only unmap the ring when no pending packets exist which still reference
the vif.

Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
Signed-off-by: Philipp Hahn <hahn@xxxxxxxxxxxxx>
---
v2: Use reference counting instead of boolean flag.
---
 drivers/net/xen-netback/interface.c | 1 +
 drivers/net/xen-netback/netback.c   | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 540a796..16a1d46 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -378,6 +378,7 @@ void xenvif_free(struct xenvif *vif)
 	atomic_dec(&vif->refcnt);
 	wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
 
+	xen_netbk_unmap_frontend_rings(vif);
 	unregister_netdev(vif->dev);
 
 	free_netdev(vif->dev);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 70b830f..104094d 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1864,6 +1864,9 @@ static int xen_netbk_kthread(void *data)
 
 void xen_netbk_unmap_frontend_rings(struct xenvif *vif)
 {
+	if (atomic_read(&vif->refcnt) != 0)
+		return;
+
 	if (vif->tx.sring)
 		xenbus_unmap_ring_vfree(xenvif_to_xenbus_device(vif),
 					vif->tx.sring);
-- 
1.9.1

