BUG in xs_tcp_setup_socket in 2.6.38-rc5

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ravi was running some automated stress tests on Linux 2.6.38-rc5,
primarily looking for regressions in the sfc driver.  However he saw the
following crash which doesn't look related to the driver.

The test systems mount various directories using NFSv3 and autofs, and
it looks as if sunrpc is handling an error condition wrongly.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa05b5e08>] xs_tcp_setup_socket+0x348/0x4a0 [sunrpc]
PGD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:00.0/irq
CPU 0 
Modules linked in: netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 sunrpc be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi cxgb3 ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext2 dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm_intel kvm uinput bnx2 sg dcdbas serio_raw pcspkr iTCO_wdt iTCO_vendor_support i5k_amb i5000_edac edac_core ioatdma dca sfc mtd mdio shpchp ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: speedstep_lib]
Pid: 10, comm: kworker/0:1 Not tainted 2.6.38-rc5 #3 Dell Inc. PowerEdge 2950/0CX396
RIP: 0010:[<ffffffffa05b5e08>]  [<ffffffffa05b5e08>] xs_tcp_setup_socket+0x348/0x4a0 [sunrpc]
RSP: 0018:ffff880126c11da0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8801220ee000 RCX: 0000000100016909
RDX: 000000000000001e RSI: ffff880123065a80 RDI: 0000000000000000
RBP: ffff880126c11df0 R08: f018000000000000 R09: febef9edc98abe03
R10: 0000000000000480 R11: 0000000000000000 R12: ffff8801220ee680
R13: ffffe8ffffc0dd00 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8800cf800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000012191f000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:1 (pid: 10, threadinfo ffff880126c10000, task ffff880126c0f560)
Stack:
 0000000000014e80 ffff880126c0f560 ffff880126c0faf0 00000000c17d7567
 ffff880126c0faf8 ffff8801273cbc40 ffff8800cf811040 ffffe8ffffc0dd00
 ffffffffa05b5ac0 0000000000000000 ffff880126c11e50 ffffffff8107b884
Call Trace:
 [<ffffffffa05b5ac0>] ? xs_tcp_setup_socket+0x0/0x4a0 [sunrpc]
 [<ffffffff8107b884>] process_one_work+0x124/0x430
 [<ffffffff8107e1d1>] worker_thread+0x181/0x3c0
 [<ffffffff8107e050>] ? worker_thread+0x0/0x3c0
 [<ffffffff810828c6>] kthread+0x96/0xa0
 [<ffffffff8100cdc4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81082830>] ? kthread+0x0/0xa0
 [<ffffffff8100cdc0>] ? kernel_thread_helper+0x0/0x10
Code: 0f 1f 00 0f 84 3a ff ff ff e9 4b fe ff ff 0f 1f 44 00 00 41 83 fd 91 0f 85 3c fe ff ff 66 0f 1f 44 00 00 e9 1b ff ff ff 0f 1f 00 <4d> 8b 6e 20 4d 8d bd 68 01 00 00 4c 89 ff e8 45 99 f0 e0 49 8b 
RIP  [<ffffffffa05b5e08>] xs_tcp_setup_socket+0x348/0x4a0 [sunrpc]
 RSP <ffff880126c11da0>
CR2: 0000000000000020
---[ end trace 6efc43bb9b1264f8 ]---

The code dump alone is pretty useless as the IP is at the start of a
block, but having disassembled the entire sunrpc module it appears that
it corresponds to:

static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
{
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);

	if (!transport->inet) {
		struct sock *sk = sock->sk; /* <-- this line */

		write_lock_bh(&sk->sk_callback_lock);

I think the bug is that xs_create_sock() returns 0 if xs_bind() fails.

This bug appears to have been introduced in 2.6.37 by:

commit b65c0310611af73569f94c526a1e2323d99b380a
Author: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
Date:   Mon Oct 4 16:53:46 2010 +0400

    sunrpc: Factor out udp sockets creation

commit 22f793268de3b4dff8abfcd873ba7afc1f34224f
Author: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
Date:   Mon Oct 4 16:54:26 2010 +0400

    sunrpc: Factor out v4 sockets creation

commit 22d44a7d8a03456aa6d0a047c051aa28728e6ecd
Author: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
Date:   Mon Oct 4 16:54:55 2010 +0400

    sunrpc: Factor out v6 sockets creation

The following (untested) patch should fix this.

Ben.

---
From: Ben Hutchings <bhutchings@xxxxxxxxxxxxxx>
Date: Tue, 22 Feb 2011 21:49:44 +0000
Subject: [PATCH net-next-2.6] sunrpc: Propagate errors from xs_bind() through xs_create_sock()

xs_create_sock() is supposed to return a pointer or an ERR_PTR-encoded
error, but it currently returns 0 if xs_bind() fails.

Signed-off-by: Ben Hutchings <bhutchings@xxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxx [v2.6.37]
---
 net/sunrpc/xprtsock.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index c431f5a..be96d42 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1631,7 +1631,8 @@ static struct socket *xs_create_sock(struct rpc_xprt *xprt,
 	}
 	xs_reclassify_socket(family, sock);
 
-	if (xs_bind(transport, sock)) {
+	err = xs_bind(transport, sock);
+	if (err) {
 		sock_release(sock);
 		goto out;
 	}
-- 
1.7.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux