Vijay,
Here are the four server logs and the client log of the machine doing
the rsync. Around 13:35 I rebooted 10.20.30.44 (brick04) during the
copy to see how self heal is handled. A few of the errors appear to
have happened while brick04 was down, and the remainder after brick04
came back up.
Cheers,
Brian
================================================================================
Version : glusterfs 2.0.6 built on Aug 25 2009 11:39:00
TLA Revision : v2.0.6
Starting Time: 2009-08-25 12:35:08
Command line : /usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /usr/local/etc/glusterfs/glusterfsd.vol
PID : 14121
System name : Linux
Nodename : brick01
Kernel Release : 2.6.28-13-server
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: volume posix
2: type storage/posix
3: option directory /gluster/exports
4: end-volume
5:
6: volume locks
7: type features/locks
8: subvolumes posix
9: end-volume
10:
11: volume brick
12: type performance/io-threads
13: option thread-count 8
14: subvolumes locks
15: end-volume
16:
17: volume server
18: type protocol/server
19: option transport-type tcp
20: option auth.addr.brick.allow *
21: subvolumes brick
22: end-volume
23:
+------------------------------------------------------------------------------+
[2009-08-25 12:35:08] N [glusterfsd.c:1224:main] glusterfs:
Successfully started
[2009-08-25 12:37:56] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1023
[2009-08-25 12:37:56] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1022
[2009-08-25 12:54:43] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1023
[2009-08-25 12:54:43] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1022
[2009-08-25 13:18:44] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1023
[2009-08-25 13:18:44] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1022
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.44:1023
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.44:1022
================================================================================
Version : glusterfs 2.0.6 built on Aug 25 2009 11:51:59
TLA Revision : v2.0.6
Starting Time: 2009-08-25 12:55:55
Command line : /usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /usr/local/etc/glusterfs/glusterfsd.vol
PID : 7318
System name : Linux
Nodename : brick05
Kernel Release : 2.6.28-13-server
Hardware Identifier: x86_64
Given volfile:
+------------------------------------------------------------------------------+
1: volume posix
2: type storage/posix
3: option directory /gluster/exports
4: end-volume
5:
6: volume locks
7: type features/locks
8: subvolumes posix
9: end-volume
10:
11: volume brick
12: type performance/io-threads
13: option thread-count 8
14: subvolumes locks
15: end-volume
16:
17: volume server
18: type protocol/server
19: option transport-type tcp
20: option auth.addr.brick.allow *
21: subvolumes brick
22: end-volume
23:
+------------------------------------------------------------------------------+
[2009-08-25 12:55:55] N [glusterfsd.c:1224:main] glusterfs:
Successfully started
[2009-08-25 12:55:56] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1016
[2009-08-25 12:55:57] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1017
[2009-08-25 12:56:03] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1017
[2009-08-25 12:56:03] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1016
[2009-08-25 13:18:44] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1017
[2009-08-25 13:18:44] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1016
[2009-08-25 13:36:17] E [posix.c:1307:posix_utimens] posix: lstat on /gluster/exports/redacted/06/.language_log-client06.2009-06-01.bz2.tJLu5e failed: No such file or directory
[2009-08-25 13:36:17] E [posix.c:1155:posix_chmod] posix: lstat on /gluster/exports/redacted/06/.language_log-client06.2009-06-01.bz2.tJLu5e failed: No such file or directory
[2009-08-25 13:37:27] E [posix.c:1307:posix_utimens] posix: lstat on /gluster/exports/redacted/07/.error_log-client06.2009-07-18.bz2.KAr0hC failed: No such file or directory
[2009-08-25 13:37:27] E [posix.c:1147:posix_chmod] posix: chmod on /redacted/07/.error_log-client06.2009-07-18.bz2.KAr0hC failed: No such file or directory
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server: accepted client from 10.20.30.44:1017
[2009-08-25 13:39:37] N [server-protocol.c:7056:mop_setvolume] server: accepted client from 10.20.30.44:1016
[2009-08-25 13:41:29] E [posix.c:1307:posix_utimens] posix: lstat on /gluster/exports/redacted/07/.language_log-client06.2009-07-28.bz2.obG8AP failed: No such file or directory
[2009-08-25 13:41:29] E [posix.c:1147:posix_chmod] posix: chmod on /redacted/07/.language_log-client06.2009-07-28.bz2.obG8AP failed: No such file or directory
[2009-08-25 13:41:46] E [posix.c:1307:posix_utimens] posix: lstat on /gluster/exports/redacted/08/.language_log-client06.2009-08-04.bz2.QJmgip failed: No such file or directory
[2009-08-25 13:41:46] E [posix.c:1147:posix_chmod] posix: chmod on /redacted/08/.language_log-client06.2009-08-04.bz2.QJmgip failed: No such file or directory
[2009-08-25 13:41:49] E [posix.c:1307:posix_utimens] posix: lstat on /gluster/exports/redacted/08/.language_log-client06.2009-08-06.bz2.vCMsb9 failed: No such file or directory
[2009-08-25 13:41:49] E [posix.c:1147:posix_chmod] posix: chmod on /redacted/08/.language_log-client06.2009-08-06.bz2.vCMsb9 failed: No such file or directory
================================================================================
Version : glusterfs 2.0.6 built on Aug 25 2009 11:53:18
TLA Revision : v2.0.6
Starting Time: 2009-08-25 13:39:30
Command line : /usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /usr/local/etc/glusterfs/glusterfsd.vol
PID : 2800
System name : Linux
Nodename : brick04
Kernel Release : 2.6.28-15-server
Hardware Identifier: x86_64
Given volfile:
+------------------------------------------------------------------------------+
1: volume posix
2: type storage/posix
3: option directory /gluster/exports
4: end-volume
5:
6: volume locks
7: type features/locks
8: subvolumes posix
9: end-volume
10:
11: volume brick
12: type performance/io-threads
13: option thread-count 8
14: subvolumes locks
15: end-volume
16:
17: volume server
18: type protocol/server
19: option transport-type tcp
20: option auth.addr.brick.allow *
21: subvolumes brick
22: end-volume
23:
+------------------------------------------------------------------------------+
[2009-08-25 13:39:30] N [glusterfsd.c:1224:main] glusterfs:
Successfully started
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.44:1021
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.44:1020
[2009-08-25 13:39:33] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1021
[2009-08-25 13:39:33] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1021
[2009-08-25 13:39:33] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1020
[2009-08-25 13:39:33] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1021
[2009-08-25 13:39:33] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1020
[2009-08-25 13:39:37] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1020
================================================================================
Version : glusterfs 2.0.6 built on Aug 25 2009 11:54:38
TLA Revision : v2.0.6
Starting Time: 2009-08-25 11:55:02
Command line : /usr/local/sbin/glusterfsd -f /usr/local/src/buildutils/staging/ubuntu-9.04/etc/glusterfs/dht-afr-server.vol
PID : 18855
System name : Linux
Nodename : brick03
Kernel Release : 2.6.28-13-server
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: volume posix
2: type storage/posix
3: option directory /gluster/exports
4: end-volume
5:
6: volume locks
7: type features/locks
8: subvolumes posix
9: end-volume
10:
11: volume brick
12: type performance/io-threads
13: option thread-count 8
14: subvolumes locks
15: end-volume
16:
17: volume server
18: type protocol/server
19: option transport-type tcp
20: option auth.addr.brick.allow *
21: subvolumes brick
22: end-volume
23:
+------------------------------------------------------------------------------+
[2009-08-25 11:55:02] N [glusterfsd.c:1224:main] glusterfs:
Successfully started
[2009-08-25 11:56:49] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1019
[2009-08-25 11:56:49] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1018
[2009-08-25 12:12:24] N [server-protocol.c:7816:notify] server:
10.20.30.41:1019 disconnected
[2009-08-25 12:12:24] N [server-protocol.c:7816:notify] server:
10.20.30.41:1018 disconnected
[2009-08-25 12:12:24] N [server-helpers.c:779:server_connection_destroy] server: destroyed connection of brick01-10907-2009/08/25-11:56:48:970540-remote3
[2009-08-25 12:37:56] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1019
[2009-08-25 12:37:56] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.41:1018
[2009-08-25 12:54:43] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1019
[2009-08-25 12:54:43] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.45:1018
[2009-08-25 13:18:44] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1019
[2009-08-25 13:18:44] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.43:1018
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.44:1019
[2009-08-25 13:39:32] N [server-protocol.c:7056:mop_setvolume] server:
accepted client from 10.20.30.44:1018
Client log from the machine doing the rsync:
================================================================================
Version : glusterfs 2.0.6 built on Aug 25 2009 13:17:05
TLA Revision : v2.0.6
Starting Time: 2009-08-25 13:18:44
Command line : /usr/local/sbin/glusterfs --log-level=NORMAL --volfile=/usr/local/etc/glusterfs/glusterfs.vol /unify
PID : 32202
System name : Linux
Nodename : client06
Kernel Release : 2.6.28-13-server
Hardware Identifier: x86_64
Given volfile:
+------------------------------------------------------------------------------+
1: volume remote1
2: type protocol/client
3: option transport-type tcp
4: option remote-host brick01
5: option remote-subvolume brick
6: end-volume
7:
8: volume remote2
9: type protocol/client
10: option transport-type tcp
11: option remote-host brick04
12: option remote-subvolume brick
13: end-volume
14:
15: volume remote3
16: type protocol/client
17: option transport-type tcp
18: option remote-host brick03
19: option remote-subvolume brick
20: end-volume
21:
22: volume remote4
23: type protocol/client
24: option transport-type tcp
25: option remote-host brick05
26: option remote-subvolume brick
27: end-volume
28:
29: volume replicate1
30: type cluster/replicate
31: subvolumes remote1 remote2
32: end-volume
33:
34: volume replicate2
35: type cluster/replicate
36: subvolumes remote3 remote4
37: end-volume
38:
39: volume distribute
40: type cluster/distribute
41: subvolumes replicate1 replicate2
42: end-volume
43:
44: volume writebehind
45: type performance/write-behind
46: option window-size 1MB
47: subvolumes distribute
48: end-volume
49:
50: volume cache
51: type performance/io-cache
52: option cache-size 512MB
53: subvolumes writebehind
54: end-volume
+------------------------------------------------------------------------------+
[2009-08-25 13:18:44] W [xlator.c:555:validate_xlator_volume_options] writebehind: option 'window-size' is deprecated, preferred is 'cache-size', continuing with correction
[2009-08-25 13:18:44] N [glusterfsd.c:1224:main] glusterfs:
Successfully started
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote4: Connected to 10.20.30.45:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [afr.c:2203:notify] replicate2: Subvolume
'remote4' came back up; going online.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote4: Connected to 10.20.30.45:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [afr.c:2203:notify] replicate2: Subvolume
'remote4' came back up; going online.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote3: Connected to 10.20.30.42:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote3: Connected to 10.20.30.42:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote2: Connected to 10.20.30.44:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [afr.c:2203:notify] replicate1: Subvolume
'remote2' came back up; going online.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote2: Connected to 10.20.30.44:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [afr.c:2203:notify] replicate1: Subvolume
'remote2' came back up; going online.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote1: Connected to 10.20.30.41:6996, attached to remote volume
'brick'.
[2009-08-25 13:18:44] N [client-protocol.c:5559:client_setvolume_cbk]
remote1: Connected to 10.20.30.41:6996, attached to remote volume
'brick'.
[2009-08-25 13:34:52] N [client-protocol.c:6246:notify] remote2:
disconnected
[2009-08-25 13:35:37] E [socket.c:745:socket_connect_finish] remote2:
connection to 10.20.30.44:6996 failed (No route to host)
[2009-08-25 13:35:37] E [socket.c:745:socket_connect_finish] remote2:
connection to 10.20.30.44:6996 failed (No route to host)
[2009-08-25 13:39:33] N [client-protocol.c:5559:client_setvolume_cbk]
remote2: Connected to 10.20.30.44:6996, attached to remote volume
'brick'.
[2009-08-25 13:39:33] N [client-protocol.c:5559:client_setvolume_cbk]
remote2: Connected to 10.20.30.44:6996, attached to remote volume
'brick'.
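For what it's worth, here is a rough sketch of how the error timestamps line up with the brick04 outage. It assumes the outage window is bounded by the client's "remote2: disconnected" line (13:34:52) and the first "remote2: Connected" line after the reboot (13:39:33), and it uses only the posix_utimens error times from the brick05 log. The names (classify, summarize, ERROR_TIMES) are mine, purely for illustration, and are not part of any GlusterFS tooling.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Assumed outage window, taken from the client log:
# "remote2: disconnected" and the next "remote2: Connected ..." line.
BRICK04_DOWN = datetime.strptime("2009-08-25 13:34:52", FMT)
BRICK04_UP = datetime.strptime("2009-08-25 13:39:33", FMT)

# Timestamps of the posix_utimens errors in the brick05 log.
ERROR_TIMES = [
    "2009-08-25 13:36:17",
    "2009-08-25 13:37:27",
    "2009-08-25 13:41:29",
    "2009-08-25 13:41:46",
    "2009-08-25 13:41:49",
]


def classify(stamp):
    """Label a log timestamp relative to the assumed outage window."""
    t = datetime.strptime(stamp, FMT)
    if t < BRICK04_DOWN:
        return "before outage"
    if t <= BRICK04_UP:
        return "during outage"
    return "after reconnect"


def summarize(stamps):
    """Count how many timestamps fall under each label."""
    counts = {}
    for stamp in stamps:
        label = classify(stamp)
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    for stamp in ERROR_TIMES:
        print(stamp, classify(stamp))
    print(summarize(ERROR_TIMES))
```

This only buckets the errors by time; it says nothing about why they occur.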
On Aug 25, 2009, at 2:00 PM, Vijay Bellur wrote:
Brian Hirt wrote:
Hello Brian,
Can you please send across the complete client and server log files?
Thanks,
Vijay
I'm setting up a test cluster, and I'm trying to copy files from
the local drive to the cluster using rsync. I get intermittent
errors like this every few minutes:
I'm running 2.0.6 on all four servers. My setup is exactly like
this: http://www.gluster.org/docs/index.php/Mixing_DHT_and_AFR
[2009-08-25 13:41:49] E [posix.c:1307:posix_utimens] posix: lstat
on /gluster/exports/.testfile.bz2.vCMsb9 failed: No such file or
directory
[2009-08-25 13:41:49] E [posix.c:1147:posix_chmod] posix: chmod
on /.testfile.bz2.vCMsb9 failed: No such file or directory
It seems like there is some sort of race condition somewhere. Does
anyone have any advice on how to fix this problem?
Thanks!