I did a few tla replay --reverse operations and found that patch level 258
works fine (except for the previously reported fchmod and ACL issues). After
a replay to 259, it breaks as below. The posix cleanup patch is what breaks
my setup.
Thanks,
Brent
On Tue, 29 Jul 2008, Brent A Nelson wrote:
I had to make the ip->addr change a number of checkouts ago. I hadn't yet
switched from tcp/client and tcp/server to socket, as backwards compatibility
seemed to work fine. I just made the change, but as expected (since the
client is obviously communicating with all the servers; for example, df
information is correct), it didn't help.
Other than this common complaint:
2008-07-29 12:04:18 C [dict.c:1141:data_to_str] dict: @data=(nil)
I have nothing in the server logs. However, I'm not sure how useful the
server logs are, as I run 4-5 server processes per machine, and they all use
the same log location.
This setup (which is a set of four machines, 4 exports per machine, 2
machines offering namespace, clientside AFR+unify) was working fine with a
checkout that was probably about a week old.
It's possible that it's due to some changes I made to the kernel of my build
machine to try to get shared writable mmap support into my fuse, but those
patches were pretty specific, and I wouldn't expect it to cause this kind of
behavior.
I'll try to figure out how to get tla to roll back to a particular patchset
and see if I can identify which patch causes the breakage.
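
A minimal sketch of that search, written as a bisection over patch levels.
The function name and the is_good callback are hypothetical; in practice the
callback would replay the tree to the given level, rebuild, and run a check
such as the ls test. It assumes one known-good level, one known-bad level,
and that every level after the first bad one also fails.

```python
from typing import Callable


def first_bad_patch(good: int, bad: int, is_good: Callable[[int], bool]) -> int:
    """Return the lowest patch level in (good, bad] for which is_good() is False.

    Assumes `good` is known to work, `bad` is known to fail, and that every
    level after the first failing one fails as well.
    """
    if good >= bad:
        raise ValueError("good must be lower than bad")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid   # breakage was introduced after mid
        else:
            bad = mid    # breakage was introduced at or before mid
    return bad
```

With a stand-in predicate, first_bad_patch(250, 270, lambda level: level <= 258)
narrows the range to a single level in five checks instead of twenty.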
Thanks,
Brent
On Tue, 29 Jul 2008, Raghavendra G wrote:
Hi Brent,
There are a couple of changes in 1.4. The authentication module "ip" has
been renamed to "addr", so the server-volume-spec file should have:
auth.addr.<brick-name>.allow <list-of-addresses>
The form of list-of-addresses depends on the address-family specified in the
transport/socket. It can be:
ip-address for inet/inet6/inet-sdp
path for unix
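
As an illustration of the rename only, here is a small sketch that rewrites
old-style option names in the text of a spec file. The helper name, the brick
name "brick" and the address pattern in the usage line are made up for the
example; it is not part of GlusterFS and does not validate the addresses.

```python
import re

# Matches the old-style option prefix, e.g. in "auth.ip.brick.allow".
_OLD_AUTH = re.compile(r"\bauth\.ip\.")


def rename_auth_ip(spec_text: str) -> str:
    """Rewrite auth.ip.<brick-name>.* option names to auth.addr.<brick-name>.*"""
    return _OLD_AUTH.sub("auth.addr.", spec_text)
```

For example, rename_auth_ip("option auth.ip.brick.allow 192.168.0.*") yields
"option auth.addr.brick.allow 192.168.0.*"; lines that already use auth.addr
are left unchanged.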
Do the server-side logs say "no authentication module is interested in
authenticating client xxxx"? If that's the case, the above fix works. If
not, can you send the server-side logs?
regards,
On Mon, Jul 28, 2008 at 11:23 PM, Brent A Nelson <brent@xxxxxxxxxxxx>
wrote:
The latest checkout seems to have a major defect in my setup. On the
bright side, the fchmod bug seems like it might be fixed (although it could
be that the filesystem isn't working well enough to tell)...
ls -al /beast
ls: cannot access /beast/vz: No such file or directory
ls: cannot access /beast/openvz: No such file or directory
ls: cannot access /beast/usr0: No such file or directory
ls: cannot access /beast/lost+found: No such file or directory
ls: reading directory /beast: File descriptor in bad state
total 128
drwxr-xr-x 6 root root 20480 2008-07-28 15:02 .
drwxrwxrwx 28 4791 kmem 4096 2008-07-18 20:32 ..
d????????? ? ? ? ? ? lost+found
-rwxr-xr-x 1 root root 92376 2008-04-04 02:42 ls
d????????? ? ? ? ? ? openvz
d????????? ? ? ? ? ? usr0
d????????? ? ? ? ? ? vz
Associated glusterfs.log:
2008-07-28 15:08:10 E [socket.c:1186:socket_submit] share4-0: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:08:10 E [afr.c:3428:afr_statfs_cbk] mirror4: (child=share4-0)
op_ret=-1 op_errno=107(Transport endpoint is not connected)
2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-0: bailing
transport
2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-1: bailing
transport
2008-07-28 15:08:59 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(1) op(34) reply=@0xb4b02790
2008-07-28 15:08:59 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(1) op(34) reply=@0xb4b025e0
2008-07-28 15:08:59 E [socket.c:1186:socket_submit] ns0-0: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:08:59 E [socket.c:1186:socket_submit] ns0-1: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:08:59 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
18: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:09:49 C [client-protocol.c:223:call_bail] ns0-0: bailing
transport
2008-07-28 15:09:49 C [client-protocol.c:223:call_bail] ns0-1: bailing
transport
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(2) op(0) reply=@0xb4b010d0
2008-07-28 15:09:49 E [dict.c:648:dict_unserialize] dict: sscanf on buf
failed
2008-07-28 15:09:49 E [client-protocol.c:3980:client_setvolume_cbk] ns0-0:
SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(1) op(34) reply=@0xb4b010d0
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(2) op(0) reply=@0xb4b010d0
2008-07-28 15:09:49 E [dict.c:648:dict_unserialize] dict: sscanf on buf
failed
2008-07-28 15:09:49 E [client-protocol.c:3980:client_setvolume_cbk] ns0-1:
SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(1) op(34) reply=@0xb4b010d0
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
18: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:09:49 E [socket.c:1186:socket_submit] ns0-0: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:09:49 E [socket.c:1186:socket_submit] ns0-1: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
19: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
19: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
20: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
20: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:09:49 E [afr.c:4180:afr_readdir_cbk] ns0: (child=ns0-1)
op_ret=-1 op_errno=77(File descriptor in bad state)
2008-07-28 15:09:49 E [fuse-bridge.c:1947:fuse_readdir_cbk] glusterfs-fuse:
21: READDIR => -1 (File descriptor in bad state)
2008-07-28 15:09:49 E [afr.c:5641:afr_closedir] ns0: child_errno[] not 0,
returning ENOTCONN
2008-07-28 15:09:49 E [fuse-bridge.c:940:fuse_err_cbk] glusterfs-fuse: 22:
(op_num=24) ERR => -1 (Transport endpoint is not connected)
2008-07-28 15:10:42 C [client-protocol.c:223:call_bail] ns0-0: bailing
transport
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(2) op(0) reply=@0x80a9cc8
2008-07-28 15:10:42 E [dict.c:648:dict_unserialize] dict: sscanf on buf
failed
2008-07-28 15:10:42 E [client-protocol.c:3980:client_setvolume_cbk] ns0-0:
SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(1) op(34) reply=@0x80a9cc8
2008-07-28 15:10:42 C [client-protocol.c:223:call_bail] ns0-1: bailing
transport
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(2) op(0) reply=@0x80a5b48
2008-07-28 15:10:42 E [dict.c:648:dict_unserialize] dict: sscanf on buf
failed
2008-07-28 15:10:42 E [client-protocol.c:3980:client_setvolume_cbk] ns0-1:
SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(1) op(34) reply=@0x80a5b48
2008-07-28 15:10:42 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
23: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:10:42 E [socket.c:1186:socket_submit] ns0-0: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:10:42 E [socket.c:1186:socket_submit] ns0-1: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:10:42 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
23: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:11:32 C [client-protocol.c:223:call_bail] ns0-1: bailing
transport
2008-07-28 15:11:32 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(2) op(0) reply=@0x80ab308
2008-07-28 15:11:32 E [dict.c:648:dict_unserialize] dict: sscanf on buf
failed
2008-07-28 15:11:32 E [client-protocol.c:3980:client_setvolume_cbk] ns0-1:
SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:11:32 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-1: forced unwinding frame type(1) op(34) reply=@0x80ab308
2008-07-28 15:11:32 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
24: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:11:32 E [socket.c:1186:socket_submit] ns0-1: transport not
connected to submit (priv->connected = 0)
2008-07-28 15:11:35 C [client-protocol.c:223:call_bail] ns0-0: bailing
transport
2008-07-28 15:11:35 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(2) op(0) reply=@0x80a9cc8
2008-07-28 15:11:35 E [dict.c:648:dict_unserialize] dict: sscanf on buf
failed
2008-07-28 15:11:35 E [client-protocol.c:3980:client_setvolume_cbk] ns0-0:
SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:11:35 E [client-protocol.c:4122:protocol_client_cleanup]
ns0-0: forced unwinding frame type(1) op(34) reply=@0x80a9cc8
2008-07-28 15:11:35 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse:
24: (op_num=34) / => -1 (No such file or directory)
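
A log this repetitive is easier to read as a tally. The sketch below counts
entries by level, source file and function. It assumes every entry begins with
"YYYY-MM-DD HH:MM:SS <level> [file.c:line:function]", as in the excerpt above,
and it tolerates the mail line wrapping and entries that run together without
a newline. The function name is made up for the example.

```python
import re
from collections import Counter

# Header of one entry: timestamp, one-letter level, [file:line:function].
ENTRY = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ([A-Z]) \[([\w.-]+):\d+:(\w+)\]"
)


def count_entries(log_text: str) -> Counter:
    """Count log entries by (level, source file, function)."""
    # Collapse all whitespace so headers split across wrapped lines rejoin.
    flat = " ".join(log_text.split())
    return Counter(m.groups() for m in ENTRY.finditer(flat))
```

Calling count_entries(text).most_common() on the client log lists the most
frequent sources first, which makes it easier to see which messages dominate.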
Also, when I tried to shut down after this test, the filesystem unmounted
fine and most of the share glusterfsd processes were killed normally, but I
had to kill -9 the namespace glusterfsd processes.
Thanks,
Brent
--
Raghavendra G
A centipede was happy quite, until a toad in fun,
Said, "Prey, which leg comes after which?",
This raised his doubts to such a pitch,
He fell flat into the ditch,
Not knowing how to run.
-Anonymous