It happens only when I try to download files from a gluster filesystem using sftp. It works fine with scp and rsync, but it always crashes with sftp. I have tried different ssh clients on different OSes and I always get the same result.
The gluster server and client are running on Ubuntu Hardy x86_64, with the latest updates.
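To reproduce it I just fetch a file over sftp from a machine with the glusterfs mount; for example (the hostname and path below are only placeholders from my setup):

sftp user@client-host
sftp> get /mnt/glusterfs/somefile
(the transfer aborts at this point, when the glusterfs client process dies)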
With glusterfs 2.0.0 final, this is the log when it crashes:
pending frames:
frame : type(1) op(READ)
patchset: 7b2e459db65edd302aa12476bc73b3b7a17b1410
signal received: 11
configuration details:argp 1
backtrace 1
bdb->cursor->get 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0
/lib/libc.so.6[0x7f2c73993100]
/usr/local/lib/glusterfs/2.0.0/xlator/protocol/client.so(client_readv_cbk+0x15d)[0x7f2c72f588cd]
/usr/local/lib/glusterfs/2.0.0/xlator/protocol/client.so(protocol_client_pollin+0xca)[0x7f2c72f470ca]
/usr/local/lib/glusterfs/2.0.0/xlator/protocol/client.so(notify+0x10b)[0x7f2c72f4e75b]
/usr/local/lib/glusterfs/2.0.0/transport/socket.so(socket_event_handler+0xcb)[0x7f2c71eb7aeb]
/usr/local/lib/libglusterfs.so.0[0x7f2c7410abda]
/usr/local/sbin/glusterfs(main+0xa38)[0x403aa8]
/lib/libc.so.6(__libc_start_main+0xf4)[0x7f2c7397f1c4]
/usr/local/sbin/glusterfs[0x4026c9]
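If a fuller trace would help, I can enable core dumps and pull a backtrace out of gdb, e.g. (binary path from my install):

ulimit -c unlimited        (before reproducing the crash)
gdb /usr/local/sbin/glusterfs core
(gdb) bt full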
The client config was in my previous post; here is the server config:
volume posix
type storage/posix
option directory /var/glusterfs
end-volume
volume locks
type features/posix-locks
option mandatory-locks on
subvolumes posix
end-volume
volume readahead
type performance/read-ahead
option page-size 128kB # 256KB is the default option
option page-count 4 # 2 is default option
option force-atime-update off # default is off
subvolumes locks
end-volume
volume brick
type performance/io-threads
option thread-count 2
subvolumes readahead
end-volume
volume server
type protocol/server
option transport-type tcp
option auth.addr.brick.allow 192.168.100.*,127.0.0.1 # Edit and add list of allowed clients comma separated IP addrs(names) here
option auth.addr.locks.allow 192.168.100.*,127.0.0.1 # Edit and add list of allowed clients comma separated IP addrs(names) here
subvolumes brick
end-volume
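(A note on the read-ahead settings above, in case it matters: as far as I understand the translator, the prefetch window per open file is page-size times page-count, so here it is 128 kB x 4 = 512 kB versus the default 256 kB x 2 = 512 kB; the total window is the same, only the page granularity differs.)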
The two involved servers act as both server and client.
The crash is reproducible on both servers.
Enrico
2009/4/30 Enrico Zaffaroni <ezaffa@xxxxxxxxx>
I had a similar situation this morning; here is my log.
Until this morning (30/4) it was working fine. I'm now running 2.0.0rc8.
2009-04-21 22:22:31 N [glusterfsd.c:1152:main] glusterfs: Successfully started
2009-04-21 22:22:31 N [client-protocol.c:6327:client_setvolume_cbk] brick1: connection and handshake succeeded
2009-04-21 22:22:31 N [afr.c:2126:notify] afr: subvolume brick1 came up
2009-04-21 22:22:31 N [client-protocol.c:6327:client_setvolume_cbk] brick1: connection and handshake succeeded
2009-04-21 22:22:31 N [afr.c:2126:notify] afr: subvolume brick1 came up
2009-04-21 22:22:31 N [client-protocol.c:6327:client_setvolume_cbk] brick2: connection and handshake succeeded
2009-04-21 22:22:31 N [afr.c:2126:notify] afr: subvolume brick2 came up
2009-04-21 22:22:31 N [client-protocol.c:6327:client_setvolume_cbk] brick2: connection and handshake succeeded
2009-04-21 22:22:31 N [afr.c:2126:notify] afr: subvolume brick2 came up
2009-04-22 06:35:07 W [afr-self-heal-common.c:1161:sh_missing_entries_lookup_cbk] afr: path /wordpress/wp-content/uploads/js_cache/tinymce_a12248cba65d26d1835b9b337723f188.gz on subvolume brick2 => -1 (No such file or directory)
2009-04-22 06:35:07 W [afr-self-heal-data.c:646:afr_sh_data_open_cbk] afr: sourcing file /wordpress/wp-content/uploads/js_cache/tinymce_a12248cba65d26d1835b9b337723f188.gz from brick1 to other sinks
pending frames:
frame : type(1) op(READ)
patchset: 82394d484803e02e28441bc0b988efaaff60dd94
signal received: 11
configuration details:argp 1
backtrace 1
bdb->cursor->get 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0rc8
/lib/libc.so.6[0x7fbb7ffb3100]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/protocol/client.so(client_readv_cbk+0x15d)[0x7fbb7f5788cd]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/protocol/client.so(protocol_client_pollin+0xca)[0x7fbb7f5670ca]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/protocol/client.so(notify+0x10b)[0x7fbb7f56e75b]
/usr/local/lib/glusterfs/2.0.0rc8/transport/socket.so(socket_event_handler+0xcb)[0x7fbb7eaf1aeb]
/usr/local/lib/libglusterfs.so.0[0x7fbb8072abea]
/usr/local/sbin/glusterfs(main+0xa38)[0x403aa8]
/lib/libc.so.6(__libc_start_main+0xf4)[0x7fbb7ff9f1c4]
/usr/local/sbin/glusterfs[0x4026c9]
---------
================================================================================
The client configuration is very simple:
volume brick2
type protocol/client
option transport-type tcp/client # for TCP/IP transport
option remote-host 192.168.100.39 # IP address of server2
option remote-subvolume brick # name of the remote volume on server2
end-volume
volume brick1
type protocol/client
option transport-type tcp/client # for TCP/IP transport
option remote-host 192.168.100.38 # IP address of server1
option remote-subvolume brick # name of the remote volume on server1
end-volume
volume afr
type cluster/afr
option metadata-self-heal on
subvolumes brick1 brick2
end-volume
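For completeness, the volfile above gets mounted the usual way (log level and paths are from my setup):

/usr/local/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/glusterfs-client.vol /mnt/glusterfs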
Enrico
2009/4/26 JV <jamson@xxxxxxxxxx>
Hello.
There wasn't anything interesting:
================================================================================
Version : glusterfs 2.0.0rc8 built on Apr 20 2009 23:04:47
TLA Revision : 82394d484803e02e28441bc0b988efaaff60dd94
Starting Time: 2009-04-23 17:17:31
Command line : /usr/local/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/glusterfs-client.vol /mnt/glusterfs
PID : 3759
System name : Linux
Nodename : f1
Kernel Release : 2.6.29.1
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: volume g1b1
2: type protocol/client
3: option transport-type tcp/client
4: option remote-host 10.10.70.2
5: option remote-subvolume brick1
6: end-volume
7: volume g1b2
8: type protocol/client
9: option transport-type tcp
10: option remote-host 10.10.70.2
11: option remote-subvolume brick2
12: end-volume
13: volume g1b3
14: type protocol/client
15: option transport-type tcp
16: option remote-host 10.10.70.2
17: option remote-subvolume brick3
18: end-volume
19:
20: volume g2b1
21: type protocol/client
22: option transport-type tcp/client
23: option remote-host 10.10.70.3
24: option remote-subvolume brick1
25: end-volume
26: volume g2b2
27: type protocol/client
28: option transport-type tcp
29: option remote-host 10.10.70.3
30: option remote-subvolume brick2
31: end-volume
32: volume g2b3
33: type protocol/client
34: option transport-type tcp
35: option remote-host 10.10.70.3
36: option remote-subvolume brick3
37: end-volume
38:
39: volume replicate1
40: type cluster/replicate
41: subvolumes g1b1 g2b1
42: end-volume
43:
44: volume replicate2
45: type cluster/replicate
46: subvolumes g1b2 g2b2
47: end-volume
48:
49: volume replicate3
50: type cluster/replicate
51: subvolumes g1b3 g2b3
52: end-volume
53:
54: volume distribute
55: type cluster/distribute
56: subvolumes replicate1 replicate2 replicate3
57: end-volume
58:
+------------------------------------------------------------------------------+
2009-04-23 17:17:31 N [glusterfsd.c:1152:main] glusterfs: Successfully started
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g1b1: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate1: subvolume g1b1 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g1b1: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate1: subvolume g1b1 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g2b1: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate1: subvolume g2b1 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g1b2: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate2: subvolume g1b2 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g2b1: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate1: subvolume g2b1 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g1b2: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate2: subvolume g1b2 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g2b2: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate2: subvolume g2b2 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g2b2: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate2: subvolume g2b2 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g1b3: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate3: subvolume g1b3 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g2b3: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate3: subvolume g2b3 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g1b3: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate3: subvolume g1b3 came up
2009-04-23 17:17:31 N [client-protocol.c:6327:client_setvolume_cbk] g2b3: connection and handshake succeeded
2009-04-23 17:17:31 N [afr.c:2126:notify] replicate3: subvolume g2b3 came up
pending frames:
patchset: 82394d484803e02e28441bc0b988efaaff60dd94
signal received: 6
configuration details:argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0rc8
[0xb803c400]
/lib/i686/cmov/libc.so.6(abort+0x188)[0xb7eb3018]
/lib/i686/cmov/libc.so.6(__assert_fail+0xee)[0xb7eaa5be]
/lib/i686/cmov/libpthread.so.0(pthread_mutex_lock+0x5d4)[0xb7fe8f54]
/usr/lib/libfuse.so.2[0xb75c0425]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/mount/fuse.so[0xb75d519a]
/usr/lib/libfuse.so.2[0xb75c1cf8]
/usr/lib/libfuse.so.2[0xb75c2211]
/usr/lib/libfuse.so.2(fuse_session_process+0x26)[0xb75c3bf6]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/mount/fuse.so[0xb75d5f31]
/lib/i686/cmov/libpthread.so.0[0xb7fe74c0]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xb7f666de]
---------
Anyhow, it seems to work without problems using git version 689347f278e5acfda95a24f7804a1450043311f3.
It also seems to run faster without any performance translators. However, gluster is using only part of the dedicated GigE link: no more than 25% for writing and 30% for reading on the client, and about half that on each storage node. What can I do to improve this?
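Would stacking a write-behind translator on top of distribute help here? Something like this (an untested sketch against my volfile above; I've left the options at their defaults since I haven't measured anything):

volume writebehind
type performance/write-behind
subvolumes distribute
end-volume

with writebehind then used as the top volume of the mount.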
P.S. Sorry for the late response; your message ended up in my junk folder.
JV
Anand Avati wrote:
There has to be something more before this. Can you paste the full log?
Avati
In the client log there is only this:
pending frames:
patchset: 82394d484803e02e28441bc0b988efaaff60dd94
signal received: 6
configuration details:argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0rc8
[0xb8067400]
/lib/i686/cmov/libc.so.6(abort+0x188)[0xb7ede018]
/lib/i686/cmov/libc.so.6(__assert_fail+0xee)[0xb7ed55be]
/lib/i686/cmov/libpthread.so.0(pthread_mutex_lock+0x5d4)[0xb8013f54]
/usr/lib/libfuse.so.2[0xb75d3147]
/usr/lib/libfuse.so.2(fuse_session_process+0x26)[0xb75d4bf6]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/mount/fuse.so[0xb75eef31]
/lib/i686/cmov/libpthread.so.0[0xb80124c0]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xb7f916de]
---------
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel
--
-------------
Enrico Zaffaroni
ezaffa@xxxxxxxxx