(There are probably going to be a TON of wrapped lines below, so you can see the unwrapped version at http://paste.fedoraproject.org/11223/80643136/)

I have the following available:
* a full packet capture of a situation that causes this problem (358MB compressed)
* logs from gluster, truncated before starting, at TRACE level (88MB compressed)
* a corefile from the gluster/nfs process generated near the end of the test (10GB, compressible to 133MB or so)
My configuration:
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 52b830b4-f6b0-4103-b90b-ce665dfbafec
Status: Stopped
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: fearless1:/export/bricks/500117310007a6d8/glusterdata
Brick2: fearless2:/export/bricks/500117310007a674/glusterdata
Brick3: fearless1:/export/bricks/500117310007a714/glusterdata
Brick4: fearless2:/export/bricks/500117310007a684/glusterdata
Brick5: fearless1:/export/bricks/500117310007a7dc/glusterdata
Brick6: fearless2:/export/bricks/500117310007a694/glusterdata
Brick7: fearless1:/export/bricks/500117310007a7e4/glusterdata
Brick8: fearless2:/export/bricks/500117310007a720/glusterdata
Brick9: fearless1:/export/bricks/500117310007a7ec/glusterdata
Brick10: fearless2:/export/bricks/500117310007a74c/glusterdata
Brick11: fearless1:/export/bricks/500117310007a838/glusterdata
Brick12: fearless2:/export/bricks/500117310007a814/glusterdata
Brick13: fearless1:/export/bricks/500117310007a850/glusterdata
Brick14: fearless2:/export/bricks/500117310007a84c/glusterdata
Brick15: fearless1:/export/bricks/500117310007a858/glusterdata
Brick16: fearless2:/export/bricks/500117310007a8f8/glusterdata
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.nfs.io-threads: on
performance.client-io-threads: on
performance.io-cache: on
performance.write-behind-window-size: 4MB
performance.nfs.write-behind: on
nfs.ports-insecure: on
diagnostics.client-log-level: INFO
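(For reference, those options would have been applied with the standard volume-set CLI; a rough sketch, using only the option names and values listed above:

  gluster volume set gv0 performance.io-cache on
  gluster volume set gv0 performance.write-behind-window-size 4MB
  gluster volume set gv0 nfs.ports-insecure on
  gluster volume profile gv0 start    # turns on the diagnostics.* counters shown above
)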
Notable "problems":
RPC context pointer, requested by JoeJulian:
(gdb) print *this
$22 = {ops = 0x7f05d50d2580, listener = 0x0, private = 0x27b6d90, xl_private = 0x0, xl = 0x261ffb0, mydata = 0x27a70d0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, refcount = 2, ctx = 0x25dd010, options = 0x7f05d70ba3c8,
name = 0x26215e0 "gv0-client-9", dnscache = 0x28fda50, buf = 0x0, init = 0x7f05d4ec5650 <init>, fini = 0x7f05d4ec6780 <fini>, reconfigure = 0x7f05d4ec56d0 <reconfigure>,
notify = 0x7f05d83f2bd0 <rpc_clnt_notify>, notify_data = 0x0, peerinfo = {sockaddr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, sockaddr_len = 16,
identifier = "192.168.11.2:24037", '\000' <repeats 89 times>}, myinfo = {sockaddr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, sockaddr_len = 16,
identifier = "192.168.11.1:933", '\000' <repeats 91 times>}, total_bytes_read = 134672272, total_bytes_write = 3678711, list = {next = 0x0, prev = 0x0}, bind_insecure = 0}
40 million outstanding frames at this point:
(gdb) print ((call_pool_t)this->ctx->pool)
$23 = {{all_frames = {next = 0x25f96c0, prev = 0x25f9620}, all_stacks = {next_call = 0x25f96c0, prev_call = 0x25f9620}}, cnt = 39705952, lock = 0, frame_mem_pool = 0x0, stack_mem_pool = 0x0}
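(Same counter straight off the pointer, for anyone poking at the corefile; this assumes ctx->pool is the usual call_pool_t pointer and that 'this' is in scope, as above:

  (gdb) print this->ctx->pool->cnt
)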
gluster/nfs very quickly gets into a state where it's spending all of its time spinning somewhere (a sketch for grabbing per-thread stacks follows the process listing):
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
26240 root 20 0 4287M 2062M 2736 R 1687 1.6 21:53.95 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26656 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:06.76 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26649 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:07.32 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26684 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:08.14 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26682 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:10.35 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26678 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:07.98 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26675 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:07.45 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26679 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:10.32 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26676 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:10.89 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26681 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:06.87 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26677 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:06.87 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26672 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:11.07 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26674 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:07.54 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26680 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:10.12 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26280 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:10.37 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26683 root 20 0 4287M 2062M 2736 R 99.0 1.6 1:08.90 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
26652 root 20 0 4287M 2062M 2736 R 98.0 1.6 1:09.92 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/nfs
6580 michael 20 0 111M 3112 1232 R 3.0 0.0 6:56.91 htop
1611 root 20 0 0 0 0 S 0.0 0.0 0:25.56 fio-wq/0
1647 root 20 0 0 0 0 S 0.0 0.0 0:21.33 fioa-scan
26228 root 20 0 1386M 57112 2172 S 0.0 0.0 0:09.38 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearles
26267 root 20 0 1386M 57112 2172 S 0.0 0.0 0:01.46 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearles
26646 root 20 0 1386M 57112 2172 S 0.0 0.0 0:00.59 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearles
26671 root 20 0 1386M 57112 2172 S 0.0 0.0 0:00.57 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearles
26685 root 20 0 1386M 57112 2172 S 0.0 0.0 0:00.46 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearles
26686 root 20 0 1386M 57112 2172 S 0.0 0.0 0:00.25 /usr/local/glusterfs/sbin/glusterfsd -s localhost --volfile-id gv0.fearles
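(If it helps narrow down the spin, per-thread stacks can be grabbed from the hot process with stock tools; a sketch, using pid 26240 from the listing above:

  top -H -p 26240                                  # confirm which thread IDs are burning CPU
  gdb -p 26240 -batch -ex 'thread apply all bt'    # one backtrace per thread
)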
yet the disks are idle:
avg-cpu: %user %nice %system %iowait %steal %idle
18.41 0.00 0.26 0.00 0.00 81.33
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 0.00 0.67 0.00 6.17 18.50 0.00 2.00 2.00 0.13
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 0.33 0.00 1.33 8.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.67 0.00 6.17 18.50 0.00 2.00 2.00 0.13
sdi 0.00 0.00 0.00 0.33 0.00 1.33 8.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
fioa 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
and volume profile says that operations aren't being responded to:
NFS Server : localhost
----------------------
Cumulative Stats:
Block Size: 2b+ 8b+ 16b+
No. of Reads: 0 0 0
No. of Writes: 2 6 18
Block Size: 32b+ 64b+ 128b+
No. of Reads: 0 0 0
No. of Writes: 85 30 19
Block Size: 256b+ 512b+ 1024b+
No. of Reads: 0 23 18
No. of Writes: 81 28 54
Block Size: 2048b+ 4096b+ 8192b+
No. of Reads: 6 28 385676
No. of Writes: 21 16 7
Block Size: 16384b+ 32768b+ 65536b+
No. of Reads: 289 135 31300
No. of Writes: 235 0 0
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 1 FORGET
0.00 0.00 us 0.00 us 0.00 us 81 RELEASE
0.00 671.00 us 671.00 us 671.00 us 1 UNLINK
0.00 151.88 us 139.00 us 166.00 us 8 OPEN
0.00 1622.00 us 1622.00 us 1622.00 us 1 RENAME
0.00 411.88 us 345.00 us 502.00 us 8 FSTAT
0.00 1515.50 us 549.00 us 3835.00 us 8 READDIRP
0.00 488.23 us 155.00 us 4467.00 us 30 LK
0.00 832.00 us 211.00 us 3206.00 us 119 SETATTR
0.00 167.32 us 115.00 us 5326.00 us 677 STATFS
0.00 11469.73 us 464.00 us 672289.00 us 74 CREATE
0.00 3335.09 us 60.00 us 3806186.00 us 1200 ACCESS
0.00 130144.47 us 235.00 us 26674232.00 us 256 LOOKUP
0.00 223278.53 us 5.00 us 5554879.00 us 602 WRITE
0.00 582000.45 us 51.00 us 75297085.00 us 278 FLUSH
0.01 129560.57 us 65.00 us 70316547.00 us 3446 STAT
99.98 8454338.20 us 99.00 us 115469515.00 us 417475 READ
Duration: 6021 seconds
Data Read: 5221816212 bytes
Data Written: 4168883 bytes
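(Profiling was already enabled via the diagnostics.* options above; the dump itself is just the standard command, roughly:

  gluster volume profile gv0 info
)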
glusterfs really is taking a long time to respond. Following a single request, XID 0x8879128a (filter: rpc.xid == 0x8879128a), through the nfs log:
[2013-05-08 21:16:00.564817] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x8879128a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 6) from rpc-transport (socket.nfs-server)
[2013-05-08 21:16:00.564860] D [nfs3-helpers.c:1757:nfs3_log_rw_call] 0-nfs-nfsv3: XID: 8879128a, READ: args: FH: exportid 52b830b4-f6b0-4103-b90b-ce665dfbafec, gfid 77c67b0a-57a1-4b56-a6dd-15899529facd, offset: 7456178176, count: 8192
[2013-05-08 21:16:03.161708] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x8879128a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 6) from rpc-transport (socket.nfs-server)
[2013-05-08 21:16:03.161784] D [nfs3-helpers.c:1757:nfs3_log_rw_call] 0-nfs-nfsv3: XID: 8879128a, READ: args: FH: exportid 52b830b4-f6b0-4103-b90b-ce665dfbafec, gfid 77c67b0a-57a1-4b56-a6dd-15899529facd, offset: 7456178176, count: 8192
[2013-05-08 21:16:13.940617] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x8879128a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 6) from rpc-transport (socket.nfs-server)
[2013-05-08 21:16:13.940662] D [nfs3-helpers.c:1757:nfs3_log_rw_call] 0-nfs-nfsv3: XID: 8879128a, READ: args: FH: exportid 52b830b4-f6b0-4103-b90b-ce665dfbafec, gfid 77c67b0a-57a1-4b56-a6dd-15899529facd, offset: 7456178176, count: 8192
[2013-05-08 21:16:41.028161] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x8879128a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 6) from rpc-transport (socket.nfs-server)
[2013-05-08 21:16:41.028205] D [nfs3-helpers.c:1757:nfs3_log_rw_call] 0-nfs-nfsv3: XID: 8879128a, READ: args: FH: exportid 52b830b4-f6b0-4103-b90b-ce665dfbafec, gfid 77c67b0a-57a1-4b56-a6dd-15899529facd, offset: 7456178176, count: 8192
[2013-05-08 21:17:03.172976] D [nfs3-helpers.c:3421:nfs3_log_read_res] 0-nfs-nfsv3: XID: 8879128a, READ: NFS: 0(Call completed successfully.), POSIX: 0(Success), count: 8192, is_eof: 0, vector: count: 1, len: 8192
[2013-05-08 21:17:03.173042] T [rpcsvc.c:1087:rpcsvc_submit_generic] 0-rpc-service: submitted reply for rpc-message (XID: 0x8879128a, Program: NFS3, ProgVers: 3, Proc: 6) to rpc-transport (socket.nfs-server)
[2013-05-08 21:17:40.996580] D [nfs3-helpers.c:3421:nfs3_log_read_res] 0-nfs-nfsv3: XID: 8879128a, READ: NFS: 0(Call completed successfully.), POSIX: 0(Success), count: 8192, is_eof: 0, vector: count: 1, len: 8192
[2013-05-08 21:17:40.996628] T [rpcsvc.c:1087:rpcsvc_submit_generic] 0-rpc-service: submitted reply for rpc-message (XID: 0x8879128a, Program: NFS3, ProgVers: 3, Proc: 6) to rpc-transport (socket.nfs-server)
[2013-05-08 21:17:44.372915] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x8879128a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 6) from rpc-transport (socket.nfs-server)
[2013-05-08 21:17:44.372956] D [nfs3-helpers.c:1757:nfs3_log_rw_call] 0-nfs-nfsv3: XID: 8879128a, READ: args: FH: exportid 52b830b4-f6b0-4103-b90b-ce665dfbafec, gfid 77c67b0a-57a1-4b56-a6dd-15899529facd, offset: 7456178176, count: 8192
[2013-05-08 21:18:12.439583] T [rpcsvc.c:382:rpcsvc_request_create] 0-rpc-service: received rpc-message (XID: 0x8879128a, Ver: 2, Program: 100003, ProgVers: 3, Proc: 6) from rpc-transport (socket.nfs-server)
[2013-05-08 21:18:12.439628] D [nfs3-helpers.c:1757:nfs3_log_rw_call] 0-nfs-nfsv3: XID: 8879128a, READ: args: FH: exportid 52b830b4-f6b0-4103-b90b-ce665dfbafec, gfid 77c67b0a-57a1-4b56-a6dd-15899529facd, offset: 7456178176, count: 8192
[2013-05-08 21:19:29.949901] D [nfs3-helpers.c:3421:nfs3_log_read_res] 0-nfs-nfsv3: XID: 8879128a, READ: NFS: 0(Call completed successfully.), POSIX: 0(Success), count: 8192, is_eof: 0, vector: count: 1, len: 8192
[2013-05-08 21:19:29.949953] T [rpcsvc.c:1087:rpcsvc_submit_generic] 0-rpc-service: submitted reply for rpc-message (XID: 0x8879128a, Program: NFS3, ProgVers: 3, Proc: 6) to rpc-transport (socket.nfs-server)
[2013-05-08 21:23:02.353017] D [nfs3-helpers.c:3421:nfs3_log_read_res] 0-nfs-nfsv3: XID: 8879128a, READ: NFS: 0(Call completed successfully.), POSIX: 0(Success), count: 8192, is_eof: 0, vector: count: 1, len: 8192
[2013-05-08 21:23:02.353066] T [rpcsvc.c:1087:rpcsvc_submit_generic] 0-rpc-service: submitted reply for rpc-message (XID: 0x8879128a, Program: NFS3, ProgVers: 3, Proc: 6) to rpc-transport (socket.nfs-server)
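(The same XID can be cross-checked against the packet capture; a sketch, with the capture filename being a placeholder:

  tshark -r capture.pcap -R 'rpc.xid == 0x8879128a'

It should show the client retransmitting the same READ several times before any reply goes back, matching the log above.)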
--
Michael Brown, Systems Consultant
Net Direct Inc.
☎: +1 519 883 1172 x5106
After a bit of load, I constantly find my gluster server getting into a state where it seems to be unable to reply to NFS RPCs:
http://pastie.org/7803022
Any idea what to do about it?
The NFS daemon also gets rather large - I suspect it's storing up data for all these RPCs:
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
30674 root 20 0 37.6G 37.3G 2288 R 99.0 29.6 5:29.88 /usr/local/glusterfs/sbin/glusterfs
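(A statedump from that process should show where the memory and queued frames are sitting; roughly, assuming default statedump settings:

  kill -USR1 30674    # glusterfs writes a statedump, usually under /tmp or /var/run/gluster
)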
I'm running 3.3.1 with a few patches: https://github.com/Supermathie/glusterfs/tree/release-3.3-oracle
Workload is usually Oracle DNFS.
M.
--
Michael Brown            | `One of the main causes of the fall of
Systems Consultant       |  the Roman Empire was that, lacking zero,
Net Direct Inc.          |  they had no way to indicate successful
☎: +1 519 883 1172 x5106 |  termination of their C programs.' - Firth