Re: [PATCH v4 00/27] add block layout driver to pnfs client

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jul 30, 2011 at 11:26 AM, Jim Rees <rees@xxxxxxxxx> wrote:
> Trond Myklebust wrote:
>
>  >   Looks like we did find a bug in NFS.
>  >
>  > It kind of looks that way.
>
>  Is that reproducible on the upstream kernel, or is it something that is
>  being introduced by the pNFS blocks code?
>
> It happens without the blocks module loaded, but it could be from something
> we did outside the module.  I will test this weekend when I get a chance.
I tried xfstests again and was able to reproduce a hang on both block
layout and file layout (upstream commit ed1e62, w/o block layout
code). It seems it is a bug in pnfs code. I did not see it w/ NFSv4.
For pnfs block and file layout, it can be reproduced by just running
xfstests with ./check -nfs. It does not show up every time but is
likely to happen in less than 10 runs.  Not sure if it is the same one
Jim reported though.

block layout trace:
[  660.039009] BUG: soft lockup - CPU#1 stuck for 22s! [10.244.82.74-ma:29730]
[  660.039014] Modules linked in: blocklayoutdriver nfs lockd fscache
auth_rpcgss nfs_acl ebtable_na
t ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM
iptable_mangle bridge stp llc sunrpc be2isc
si ip6t_REJECT iscsi_boot_sysfs nf_conntrack_ipv6 nf_defrag_ipv6 bnx2i
ip6table_filter cnic uio ip6_
tables cxgb3i libcxgbi cxgb3 mdio iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi ppdev i2c_pii
x4 i2c_core pcspkr e1000 parport_pc microcode parport vmw_balloon
shpchp ipv6 floppy mptspi mptscsih
 mptbase scsi_transport_spi [last unloaded: nfs]
[  660.039014] CPU 1
[  660.039014] Modules linked in: blocklayoutdriver nfs lockd fscache
auth_rpcgss nfs_acl ebtable_na
t ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM
iptable_mangle bridge stp llc sunrpc be2isc
si ip6t_REJECT iscsi_boot_sysfs nf_conntrack_ipv6 nf_defrag_ipv6 bnx2i
ip6table_filter cnic uio ip6_
tables cxgb3i libcxgbi cxgb3 mdio iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi ppdev i2c_pii
x4 i2c_core pcspkr e1000 parport_pc microcode parport vmw_balloon
shpchp ipv6 floppy mptspi mptscsih
 mptbase scsi_transport_spi [last unloaded: nfs]
[  660.039014]
[  660.039014] Pid: 29730, comm: 10.244.82.74-ma Tainted: G      D
3.0.0-pnfs+ #2 VMware, Inc. V
Mware Virtual Platform/440BX Desktop Reference Platform
[  660.039014] RIP: 0010:[<ffffffff81084f49>]  [<ffffffff81084f49>]
do_raw_spin_lock+0x1e/0x25
[  660.039014] RSP: 0018:ffff88001fef5e60  EFLAGS: 00000297
[  660.039014] RAX: 000000000000002b RBX: ffff88003be19000 RCX: 0000000000000001
[  660.039014] RDX: 000000000000002a RSI: ffff8800219a7cf0 RDI: ffff880020e4d988
[  660.039014] RBP: ffff88001fef5e60 R08: 0000000000000000 R09: 000000000000df20
[  660.039014] R10: 0000000000000000 R11: ffff8800219a7c00 R12: ffff88001fef5df0
[  660.039014] R13: 00000000c355df1b R14: ffff88003bfaeac0 R15: ffff8800219a7c00
[  660.039014] FS:  0000000000000000(0000) GS:ffff88003fd00000(0000)
knlGS:0000000000000000
[  660.039014] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  660.039014] CR2: 00007fc6122a4000 CR3: 0000000001a04000 CR4: 00000000000006e0
[  660.039014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  660.039014] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  660.039014] Process 10.244.82.74-ma (pid: 29730, threadinfo
ffff88001fef4000, task ffff88001fca80
00)
[  660.039014] Stack:
[  660.039014]  ffff88001fef5e70 ffffffff814585ee ffff88001fef5e90
ffffffffa02badee
[  660.039014]  0000000000000000 ffff8800219a7c00 ffff88001fef5ee0
ffffffffa02bc2d9
[  660.039014]  ffff880000000000 ffffffffa02d2250 ffff88001fef5ee0
ffff88002059ba10
[  660.039014] Call Trace:
[  660.039014]  [<ffffffff814585ee>] _raw_spin_lock+0xe/0x10
[  660.039014]  [<ffffffffa02badee>] nfs4_begin_drain_session+0x24/0x8f [nfs]
[  660.039014]  [<ffffffffa02bc2d9>] nfs4_run_state_manager+0x271/0x517 [nfs]
[  660.039014]  [<ffffffffa02bc068>] ? nfs4_do_reclaim+0x422/0x422 [nfs]
[  660.039014]  [<ffffffff810719bf>] kthread+0x84/0x8c
[  660.039014]  [<ffffffff81460f54>] kernel_thread_helper+0x4/0x10
[  660.039014]  [<ffffffff8107193b>] ? kthread_worker_fn+0x148/0x148
[  660.039014]  [<ffffffff81460f50>] ? gs_change+0x13/0x13
[  660.039014] Code: 00 00 10 00 74 05 e8 a7 59 1b 00 5d c3 55 48 89
e5 66 66 66 66 90 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2
74 07 f3 90 <0f> b7 17 eb f5 5d c3 55 48 89 e5 66 66 66 66 90 8b 07 89
c2 c1
[  660.039014] Call Trace:
[  660.039014]  [<ffffffff814585ee>] _raw_spin_lock+0xe/0x10
[  660.039014]  [<ffffffffa02badee>] nfs4_begin_drain_session+0x24/0x8f [nfs]
[  660.039014]  [<ffffffffa02bc2d9>] nfs4_run_state_manager+0x271/0x517 [nfs]
[  660.039014]  [<ffffffffa02bc068>] ? nfs4_do_reclaim+0x422/0x422 [nfs]
[  660.039014]  [<ffffffff810719bf>] kthread+0x84/0x8c
[  660.039014]  [<ffffffff81460f54>] kernel_thread_helper+0x4/0x10
[  660.039014]  [<ffffffff8107193b>] ? kthread_worker_fn+0x148/0x148
[  660.039014]  [<ffffffff81460f50>] ? gs_change+0x13/0x13


file layout trace:
[19716.049009] BUG: soft lockup - CPU#1 stuck for 23s! [10.244.82.76-ma:29036]
[19716.049011] Modules linked in: nfs_layout_nfsv41_files nfs lockd
fscache auth_rpcgss nfs_acl ebtable_nat ebtables ipt_MASQUERADE
iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc sunrpc
ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 be2iscsi iscsi_boot_sysfs
bnx2i cnic ip6table_filter uio ip6_tables cxgb3i libcxgbi cxgb3 mdio
iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev microcode
i2c_piix4 e1000 vmw_balloon parport_pc parport shpchp pcspkr i2c_core
ipv6 mptspi mptscsih mptbase scsi_transport_spi floppy [last unloaded:
nfs]
[19716.049011] CPU 1
[19716.049011] Modules linked in: nfs_layout_nfsv41_files nfs lockd
fscache auth_rpcgss nfs_acl ebtable_nat ebtables ipt_MASQUERADE
iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc sunrpc
ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 be2iscsi iscsi_boot_sysfs
bnx2i cnic ip6table_filter uio ip6_tables cxgb3i libcxgbi cxgb3 mdio
iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev microcode
i2c_piix4 e1000 vmw_balloon parport_pc parport shpchp pcspkr i2c_core
ipv6 mptspi mptscsih mptbase scsi_transport_spi floppy [last unloaded:
nfs]
[19716.049011]
[19716.049011] Pid: 29036, comm: 10.244.82.76-ma Tainted: G      D
3.0.0-pnfs+ #2 VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform
[19716.049011] RIP: 0010:[<ffffffff81084f49>]  [<ffffffff81084f49>]
do_raw_spin_lock+0x1e/0x25
[19716.049011] RSP: 0018:ffff88002a69be60  EFLAGS: 00000297
[19716.049011] RAX: 0000000000000005 RBX: ffff88002a59fd00 RCX: 0000000000000002
[19716.049011] RDX: 0000000000000004 RSI: ffff8800208c00f0 RDI: ffff8800208c1188
[19716.049011] RBP: ffff88002a69be60 R08: 0000000000000002 R09: 0000ffff00066c0a
[19716.049011] R10: 0000ffff00066c0a R11: ffff8800208c0000 R12: ffff88002a69bdf0
[19716.049011] R13: 0000000001ce15a2 R14: ffff88002a6f1f80 R15: ffff8800208c0000
[19716.049011] FS:  0000000000000000(0000) GS:ffff88003fd00000(0000)
knlGS:0000000000000000
[19716.049011] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[19716.049011] CR2: 00007fad5ac53000 CR3: 0000000038784000 CR4: 00000000000006e0
[19716.049011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[19716.049011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[19716.049011] Process 10.244.82.76-ma (pid: 29036, threadinfo
ffff88002a69a000, task ffff880022db9720)
[19716.049011] Stack:
[19716.049011]  ffff88002a69be70 ffffffff814585ee ffff88002a69be90
ffffffffa02be836
[19716.049011]  0000000000000002 ffff8800208c0000 ffff88002a69bee0
ffffffffa02bfd21
[19716.049011]  ffff880000000000 ffffffffa02d59c0 ffff88002a69bee0
ffff880037971ce8
[19716.049011] Call Trace:
[19716.049011]  [<ffffffff814585ee>] _raw_spin_lock+0xe/0x10
[19716.049011]  [<ffffffffa02be836>] nfs4_begin_drain_session+0x24/0x8f [nfs]
[19716.049011]  [<ffffffffa02bfd21>] nfs4_run_state_manager+0x271/0x517 [nfs]
[19716.049011]  [<ffffffffa02bfab0>] ? nfs4_do_reclaim+0x422/0x422 [nfs]
[19716.049011]  [<ffffffff810719bf>] kthread+0x84/0x8c
[19716.049011]  [<ffffffff81460f54>] kernel_thread_helper+0x4/0x10
[19716.049011]  [<ffffffff8107193b>] ? kthread_worker_fn+0x148/0x148
[19716.049011]  [<ffffffff81460f50>] ? gs_change+0x13/0x13
[19716.049011] Code: 00 00 10 00 74 05 e8 a7 59 1b 00 5d c3 55 48 89
e5 66 66 66 66 90 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2
74 07 f3 90 <0f> b7 17 eb f5 5d c3 55 48 89 e5 66 66 66 66 90 8b 07 89
c2 c1
[19716.049011] Call Trace:
[19716.049011] Call Trace:
[19716.049011]  [<ffffffff814585ee>] _raw_spin_lock+0xe/0x10
[19716.049011]  [<ffffffffa02be836>] nfs4_begin_drain_session+0x24/0x8f [nfs]
[19716.049011]  [<ffffffffa02bfd21>] nfs4_run_state_manager+0x271/0x517 [nfs]
[19716.049011]  [<ffffffffa02bfab0>] ? nfs4_do_reclaim+0x422/0x422 [nfs]
[19716.049011]  [<ffffffff810719bf>] kthread+0x84/0x8c
[19716.049011]  [<ffffffff81460f54>] kernel_thread_helper+0x4/0x10
[19716.049011]  [<ffffffff8107193b>] ? kthread_worker_fn+0x148/0x148
[19716.049011]  [<ffffffff81460f50>] ? gs_change+0x13/0x13

-- 
Thanks,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux