I've noticed a very high incidence of the problem I reported a while
back, which manifests itself in open files getting corrupted on commit,
possibly under conditions that involve server disconnections due to
timeouts (very high disk load). Specifically, my .viminfo file got
corrupted for the 3rd time today. Since this is root's .viminfo and I'm
running glfs as root, though, I don't have the logs to verify the
disconnections. From what I can tell, a chunk of a dll somehow ends up
in .viminfo, but I'm not sure which one.
On a different volume, I'm seeing other weirdness under the same high
disk load conditions (software RAID check/resync on all server nodes).
This seems to be specifically related to using writebehind+iocache on
the client side on one of the servers - the one exported via unfsd (the
unfsd build from the gluster ftp site). What happens is that the /home
volume simply seems to disappear underneath unfsd! The attached log
indicates a glusterfs crash.
This doesn't happen if I remove the writebehind and io-cache translators.
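For clarity, the configuration that does not exhibit this is essentially
the attached volfile with the *-writebehind and *-iocache volumes dropped,
so that the replicate volume sits directly on top of the underlying
volumes - roughly (the rest of home-cache.vol left unchanged):

volume home
  type cluster/replicate
  subvolumes home2 home1 home3
  option read-subvolume home2
  option favorite-child home2
end-volume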
Other notable things about the setup that might help figure out the
cause of this:
- The other two servers are idle - they are not serving any requests.
They are, however, also under the same high disk load.
- writebehind and io-cache are only applied on one server - the one being
used to export via unfsd. The other servers do not have those
translators applied. The volume config is attached. It is called
home-cache.vol, but it is the same file the log refers to, even though
it is listed there as home.vol.
The problem specifically occurs when the servers are undergoing high
load of the described nature, which causes disk latencies to go up
massively. I have not observed any instance of a similar crash happening
without the writebehind and io-cache translators.
Gordan
================================================================================
Version : glusterfs 2.0.9 built on Dec 24 2009 23:13:15
git: v2.0.9
Starting Time: 2010-01-05 11:18:37
Command line : /usr/sbin/glusterfs --log-level=NORMAL --disable-direct-io-mode --volfile=/etc/glusterfs/home.vol /home
PID : 14917
System name : Linux
Nodename : raiden.shatteredsilicon.net
Kernel Release : 2.6.18-164.9.1.el5
Hardware Identifier: x86_64
Given volfile:
+------------------------------------------------------------------------------+
1: volume home1
2: type protocol/client
3: option transport-type socket
4: option transport.address-family inet
5: option remote-host 10.2.0.11
6: option remote-port 6997
7: option remote-subvolume home1
8: end-volume
9:
10: volume home1-writebehind
11: type performance/write-behind
12: option cache-size 2MB # default is equal to aggregate-size
13: option flush-behind on # default is 'off'
14: option enable-O_SYNC on
15: subvolumes home1
16: end-volume
17:
18: volume home1-iocache
19: type performance/io-cache
20: option cache-size 64MB
21: option cache-timeout 2 # default is 1 second
22: subvolumes home1-writebehind
23: end-volume
24:
25: ##############################################################################
26:
27: volume home3
28: type protocol/client
29: option transport-type socket
30: option transport.address-family inet
31: option remote-host 10.2.0.13
32: option remote-port 6997
33: option remote-subvolume home3
34: end-volume
35:
36: volume home3-writebehind
37: type performance/write-behind
38: option cache-size 2MB # default is equal to aggregate-size
39: option flush-behind on # default is 'off'
40: option enable-O_SYNC on
41: subvolumes home3
42: end-volume
43:
44: volume home3-iocache
45: type performance/io-cache
46: option cache-size 64MB
47: option cache-timeout 2 # default is 1 second
48: subvolumes home3-writebehind
49: end-volume
50:
51: ##############################################################################
52:
53: volume home-store
54: type storage/posix
55: option directory /gluster/home
56: end-volume
57:
58: volume home2
59: type features/posix-locks
60: subvolumes home-store
61: end-volume
62:
63: volume home2-writebehind
64: type performance/write-behind
65: option cache-size 2MB # default is equal to aggregate-size
66: option flush-behind on # default is 'off'
67: option enable-O_SYNC on
68: subvolumes home2
69: end-volume
70:
71: volume home2-iocache
72: type performance/io-cache
73: option cache-size 64MB
74: option cache-timeout 2 # default is 1 second
75: subvolumes home2-writebehind
76: end-volume
77:
78: ##############################################################################
79:
80: volume server
81: type protocol/server
82: option transport-type socket
83: option transport.address-family inet
84: option transport.socket.listen-port 6997
85: subvolumes home2
86: option auth.addr.home2.allow 127.0.0.1,10.*
87: end-volume
88:
89: volume home
90: type cluster/replicate
91: subvolumes home2-iocache home1-iocache home3-iocache
92: option read-subvolume home2-iocache
93: option favorite-child home2-iocache
94: end-volume
+------------------------------------------------------------------------------+
[2010-01-05 11:18:37] W [afr.c:2436:init] home: You have specified subvolume 'home2-iocache' as the 'favorite child'. This means that if a discrepancy in the content or attributes (ownership, permission, etc.) of a file is detected among the subvolumes, the file on 'home2-iocache' will be considered the definitive version and its contents will OVERWRITE the contents of the file on other subvolumes. All versions of the file except that on 'home2-iocache' WILL BE LOST.
[2010-01-05 11:18:37] N [afr.c:2194:notify] home: Subvolume 'home2-iocache' came back up; going online.
[2010-01-05 11:18:37] N [afr.c:2194:notify] home: Subvolume 'home2-iocache' came back up; going online.
[2010-01-05 11:18:37] N [afr.c:2194:notify] home: Subvolume 'home2-iocache' came back up; going online.
[2010-01-05 11:18:37] N [glusterfsd.c:1306:main] glusterfs: Successfully started
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home1: Connected to 10.2.0.11:6997, attached to remote volume 'home1'.
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home1: Connected to 10.2.0.11:6997, attached to remote volume 'home1'.
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home3: Connected to 10.2.0.13:6997, attached to remote volume 'home3'.
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home3: Connected to 10.2.0.13:6997, attached to remote volume 'home3'.
[2010-01-05 11:18:37] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.13:1015
[2010-01-05 11:18:37] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.13:1014
[2010-01-05 11:18:47] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.11:1009
[2010-01-05 11:18:47] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.11:1008
[2010-01-05 11:21:05] E [posix.c:3156:do_xattrop] home-store: getxattr failed on /gordan/.gconf/apps/bluetooth-manager while doing xattrop: No such file or directory
pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
patchset: v2.0.9
signal received: 11
time of crash: 2010-01-05 11:21:06
configuration details:
argp 1
backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.9
/lib64/libc.so.6[0x3f55e302d0]
/lib64/libc.so.6(memcpy+0x15b)[0x3f55e7bf0b]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(__wb_collapse_write_bufs+0x105)[0x2ac6b23bc045]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_process_queue+0xa8)[0x2ac6b23bcc68]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_writev+0x373)[0x2ac6b23bf0e3]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev+0x123)[0x2ac6b25c5a13]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_cbk+0x1d6)[0x2ac6b2e2dea6]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_frame_return+0x240)[0x2ac6b25c7390]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_dispatch_requests+0x25b)[0x2ac6b25c4f4b]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_readv+0x1fa)[0x2ac6b25c679a]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write+0xd7)[0x2ac6b2e2c1c7]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write_iter+0x5d)[0x2ac6b2e2dc8d]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_write_cbk+0x91)[0x2ac6b2e2dfb1]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev_cbk+0x7a)[0x2ac6b25c5aca]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_stack_unwind+0x6a)[0x2ac6b23bbe8a]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_do_ops+0x2c)[0x2ac6b23bcb7c]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_process_queue+0xf4)[0x2ac6b23bccb4]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_writev+0x373)[0x2ac6b23bf0e3]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev+0x123)[0x2ac6b25c5a13]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_cbk+0x1d6)[0x2ac6b2e2dea6]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_frame_return+0x240)[0x2ac6b25c7390]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_dispatch_requests+0x25b)[0x2ac6b25c4f4b]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_readv+0x1fa)[0x2ac6b25c679a]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write+0xd7)[0x2ac6b2e2c1c7]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write_iter+0x5d)[0x2ac6b2e2dc8d]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_write_cbk+0x91)[0x2ac6b2e2dfb1]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev_cbk+0x7a)[0x2ac6b25c5aca]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_stack_unwind+0x6a)[0x2ac6b23bbe8a]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_do_ops+0x2c)[0x2ac6b23bcb7c]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_process_queue+0xf4)[0x2ac6b23bccb4]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_sync_cbk+0xc5)[0x2ac6b23be2b5]
/usr/lib64/glusterfs/2.0.9/xlator/protocol/client.so(client_write_cbk+0x14e)[0x2ac6b21b13de]
/usr/lib64/glusterfs/2.0.9/xlator/protocol/client.so(protocol_client_pollin+0xca)[0x2ac6b21a18aa]
/usr/lib64/glusterfs/2.0.9/xlator/protocol/client.so(notify+0x212)[0x2ac6b21a84e2]
/usr/lib64/glusterfs/2.0.9/transport/socket.so(socket_event_handler+0xd3)[0x2aaaaaaafe33]
/usr/lib64/libglusterfs.so.0[0x3f56a27115]
/usr/sbin/glusterfs(main+0xa06)[0x403e96]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3f55e1d994]
/usr/sbin/glusterfs[0x402509]
---------
================================================================================
Attached volume config (home-cache.vol):
volume home1
type protocol/client
option transport-type socket
option transport.address-family inet
option remote-host 10.2.0.11
option remote-port 6997
option remote-subvolume home1
end-volume
volume home1-writebehind
type performance/write-behind
option cache-size 2MB # default is equal to aggregate-size
option flush-behind on # default is 'off'
option enable-O_SYNC on
subvolumes home1
end-volume
volume home1-iocache
type performance/io-cache
option cache-size 64MB
option cache-timeout 2 # default is 1 second
subvolumes home1-writebehind
end-volume
##############################################################################
volume home3
type protocol/client
option transport-type socket
option transport.address-family inet
option remote-host 10.2.0.13
option remote-port 6997
option remote-subvolume home3
end-volume
volume home3-writebehind
type performance/write-behind
option cache-size 2MB # default is equal to aggregate-size
option flush-behind on # default is 'off'
option enable-O_SYNC on
subvolumes home3
end-volume
volume home3-iocache
type performance/io-cache
option cache-size 64MB
option cache-timeout 2 # default is 1 second
subvolumes home3-writebehind
end-volume
##############################################################################
volume home-store
type storage/posix
option directory /gluster/home
end-volume
volume home2
type features/posix-locks
subvolumes home-store
end-volume
volume home2-writebehind
type performance/write-behind
option cache-size 2MB # default is equal to aggregate-size
option flush-behind on # default is 'off'
option enable-O_SYNC on
subvolumes home2
end-volume
volume home2-iocache
type performance/io-cache
option cache-size 64MB
option cache-timeout 2 # default is 1 second
subvolumes home2-writebehind
end-volume
##############################################################################
volume server
type protocol/server
option transport-type socket
option transport.address-family inet
option transport.socket.listen-port 6997
subvolumes home2
option auth.addr.home2.allow 127.0.0.1,10.*
end-volume
volume home
type cluster/replicate
subvolumes home2-iocache home1-iocache home3-iocache
option read-subvolume home2-iocache
option favorite-child home2-iocache
end-volume