Disconnections and Corruption Under High Load

I've noticed a very high incidence of the problem I reported a while back, which manifests itself as open files getting corrupted on commit, possibly during conditions involving server disconnections due to timeouts (very high disk load). Specifically, my .viminfo file got corrupted for the 3rd time today. Since this is root's .viminfo and I'm running glfs as root, though, I don't have the logs to verify the disconnections. From what I can tell, a chunk of a dll somehow ends up in .viminfo, but I'm not sure which one.

On a different volume, I'm seeing other weirdness under the same high disk load conditions (software RAID check/resync on all server nodes). This seems to be specifically related to using writebehind+iocache on the client side on one of the servers, exported via unfsd (the one from the gluster ftp site). What happens is that the /home volume simply seems to disappear underneath unfsd! The attached log indicates a glusterfsd crash.

This doesn't happen if I remove the writebehind and io-cache translators.
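
For reference, the workaround amounts to pointing the replicate volume directly at the plain subvolumes instead of at their iocache wrappers. A minimal sketch of the modified client-side stack (derived from the attached home-cache.vol, with home1/home3 being the remote client volumes and home2 the local posix-locks volume; untested):

```
volume home
	type cluster/replicate
	subvolumes home2 home1 home3
	option read-subvolume home2
	option favorite-child home2
end-volume
```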

Other notable things about the setup that might help figure out the cause of this:

- The other two servers are idle - they are not serving any requests. They are, however, also under the same high disk load.

- writebehind and io-cache are only applied on one server, the one being used to export via unfsd. The other servers do not have those translators applied. The volume config is attached. It is called home-cache.vol, but it is the same file the log refers to, even though it is listed there as home.vol.

The problem occurs specifically when the servers are undergoing high load of the described nature, which causes disk latencies to go up massively. I have not observed any instance of a similar crash without the writebehind and io-cache translators.

Gordan
================================================================================
Version      : glusterfs 2.0.9 built on Dec 24 2009 23:13:15
git: v2.0.9
Starting Time: 2010-01-05 11:18:37
Command line : /usr/sbin/glusterfs --log-level=NORMAL --disable-direct-io-mode --volfile=/etc/glusterfs/home.vol /home 
PID          : 14917
System name  : Linux
Nodename     : raiden.shatteredsilicon.net
Kernel Release : 2.6.18-164.9.1.el5
Hardware Identifier: x86_64

Given volfile:
+------------------------------------------------------------------------------+
  1: volume home1
  2: 	type protocol/client
  3: 	option transport-type socket
  4: 	option transport.address-family inet
  5: 	option remote-host 10.2.0.11
  6: 	option remote-port 6997
  7: 	option remote-subvolume home1
  8: end-volume
  9: 
 10: volume home1-writebehind
 11: 	type performance/write-behind
 12: 	option cache-size 2MB		# default is equal to aggregate-size
 13: 	option flush-behind on		# default is 'off'
 14: 	option enable-O_SYNC on
 15: 	subvolumes home1
 16: end-volume
 17: 
 18: volume home1-iocache
 19: 	type performance/io-cache
 20: 	option cache-size 64MB
 21: 	option cache-timeout 2		# default is 1 second
 22: 	subvolumes home1-writebehind
 23: end-volume
 24: 
 25: ##############################################################################
 26: 
 27: volume home3
 28: 	type protocol/client
 29: 	option transport-type socket
 30: 	option transport.address-family inet
 31: 	option remote-host 10.2.0.13
 32: 	option remote-port 6997
 33: 	option remote-subvolume home3
 34: end-volume
 35: 
 36: volume home3-writebehind
 37: 	type performance/write-behind
 38: 	option cache-size 2MB		# default is equal to aggregate-size
 39: 	option flush-behind on		# default is 'off'
 40: 	option enable-O_SYNC on
 41: 	subvolumes home3
 42: end-volume
 43: 
 44: volume home3-iocache
 45: 	type performance/io-cache
 46: 	option cache-size 64MB
 47: 	option cache-timeout 2		# default is 1 second
 48: 	subvolumes home3-writebehind
 49: end-volume
 50: 
 51: ##############################################################################
 52: 
 53: volume home-store
 54: 	type storage/posix
 55: 	option directory /gluster/home
 56: end-volume
 57: 
 58: volume home2
 59: 	type features/posix-locks
 60: 	subvolumes home-store
 61: end-volume
 62: 
 63: volume home2-writebehind
 64: 	type performance/write-behind
 65: 	option cache-size 2MB		# default is equal to aggregate-size
 66: 	option flush-behind on		# default is 'off'
 67: 	option enable-O_SYNC on
 68: 	subvolumes home2
 69: end-volume
 70: 
 71: volume home2-iocache
 72: 	type performance/io-cache
 73: 	option cache-size 64MB
 74: 	option cache-timeout 2		# default is 1 second
 75: 	subvolumes home2-writebehind
 76: end-volume
 77: 
 78: ##############################################################################
 79: 
 80: volume server
 81: 	type protocol/server
 82: 	option transport-type socket
 83: 	option transport.address-family inet
 84: 	option transport.socket.listen-port 6997
 85: 	subvolumes home2
 86: 	option auth.addr.home2.allow 127.0.0.1,10.*
 87: end-volume
 88: 
 89: volume home
 90: 	type cluster/replicate
 91: 	subvolumes home2-iocache home1-iocache home3-iocache
 92: 	option read-subvolume home2-iocache
 93: 	option favorite-child home2-iocache
 94: end-volume

+------------------------------------------------------------------------------+
[2010-01-05 11:18:37] W [afr.c:2436:init] home: You have specified subvolume 'home2-iocache' as the 'favorite child'. This means that if a discrepancy in the content or attributes (ownership, permission, etc.) of a file is detected among the subvolumes, the file on 'home2-iocache' will be considered the definitive version and its contents will OVERWRITE the contents of the file on other subvolumes. All versions of the file except that on 'home2-iocache' WILL BE LOST.
[2010-01-05 11:18:37] N [afr.c:2194:notify] home: Subvolume 'home2-iocache' came back up; going online.
[2010-01-05 11:18:37] N [afr.c:2194:notify] home: Subvolume 'home2-iocache' came back up; going online.
[2010-01-05 11:18:37] N [afr.c:2194:notify] home: Subvolume 'home2-iocache' came back up; going online.
[2010-01-05 11:18:37] N [glusterfsd.c:1306:main] glusterfs: Successfully started
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home1: Connected to 10.2.0.11:6997, attached to remote volume 'home1'.
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home1: Connected to 10.2.0.11:6997, attached to remote volume 'home1'.
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home3: Connected to 10.2.0.13:6997, attached to remote volume 'home3'.
[2010-01-05 11:18:37] N [client-protocol.c:5733:client_setvolume_cbk] home3: Connected to 10.2.0.13:6997, attached to remote volume 'home3'.
[2010-01-05 11:18:37] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.13:1015
[2010-01-05 11:18:37] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.13:1014
[2010-01-05 11:18:47] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.11:1009
[2010-01-05 11:18:47] N [server-protocol.c:7065:mop_setvolume] server: accepted client from 10.2.0.11:1008
[2010-01-05 11:21:05] E [posix.c:3156:do_xattrop] home-store: getxattr failed on /gordan/.gconf/apps/bluetooth-manager while doing xattrop: No such file or directory
pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)

patchset: v2.0.9
signal received: 11
time of crash: 2010-01-05 11:21:06
configuration details:
argp 1
backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.9
/lib64/libc.so.6[0x3f55e302d0]
/lib64/libc.so.6(memcpy+0x15b)[0x3f55e7bf0b]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(__wb_collapse_write_bufs+0x105)[0x2ac6b23bc045]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_process_queue+0xa8)[0x2ac6b23bcc68]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_writev+0x373)[0x2ac6b23bf0e3]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev+0x123)[0x2ac6b25c5a13]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_cbk+0x1d6)[0x2ac6b2e2dea6]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_frame_return+0x240)[0x2ac6b25c7390]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_dispatch_requests+0x25b)[0x2ac6b25c4f4b]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_readv+0x1fa)[0x2ac6b25c679a]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write+0xd7)[0x2ac6b2e2c1c7]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write_iter+0x5d)[0x2ac6b2e2dc8d]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_write_cbk+0x91)[0x2ac6b2e2dfb1]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev_cbk+0x7a)[0x2ac6b25c5aca]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_stack_unwind+0x6a)[0x2ac6b23bbe8a]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_do_ops+0x2c)[0x2ac6b23bcb7c]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_process_queue+0xf4)[0x2ac6b23bccb4]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_writev+0x373)[0x2ac6b23bf0e3]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev+0x123)[0x2ac6b25c5a13]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_cbk+0x1d6)[0x2ac6b2e2dea6]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_frame_return+0x240)[0x2ac6b25c7390]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_dispatch_requests+0x25b)[0x2ac6b25c4f4b]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_readv+0x1fa)[0x2ac6b25c679a]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write+0xd7)[0x2ac6b2e2c1c7]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_read_write_iter+0x5d)[0x2ac6b2e2dc8d]
/usr/lib64/glusterfs/2.0.9/xlator/cluster/replicate.so(afr_sh_data_write_cbk+0x91)[0x2ac6b2e2dfb1]
/usr/lib64/glusterfs/2.0.9/xlator/performance/io-cache.so(ioc_writev_cbk+0x7a)[0x2ac6b25c5aca]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_stack_unwind+0x6a)[0x2ac6b23bbe8a]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_do_ops+0x2c)[0x2ac6b23bcb7c]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_process_queue+0xf4)[0x2ac6b23bccb4]
/usr/lib64/glusterfs/2.0.9/xlator/performance/write-behind.so(wb_sync_cbk+0xc5)[0x2ac6b23be2b5]
/usr/lib64/glusterfs/2.0.9/xlator/protocol/client.so(client_write_cbk+0x14e)[0x2ac6b21b13de]
/usr/lib64/glusterfs/2.0.9/xlator/protocol/client.so(protocol_client_pollin+0xca)[0x2ac6b21a18aa]
/usr/lib64/glusterfs/2.0.9/xlator/protocol/client.so(notify+0x212)[0x2ac6b21a84e2]
/usr/lib64/glusterfs/2.0.9/transport/socket.so(socket_event_handler+0xd3)[0x2aaaaaaafe33]
/usr/lib64/libglusterfs.so.0[0x3f56a27115]
/usr/sbin/glusterfs(main+0xa06)[0x403e96]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3f55e1d994]
/usr/sbin/glusterfs[0x402509]
---------

