Hi again,

Well... it is worse than I first thought. glusterfs is not working anymore. I've already tried rebooting all the servers, nodes and clients, and the filesystem is not available anymore :( . The clients keep reporting errors while communicating with the nodes, and there's no way they can mount the filesystem :(

I see that I'll have to reinstall all the servers, because something is really badly broken and glusterfs won't work anymore. The servers haven't hung at all, and all the reboots have been "clean", with only glusterfs broken.

On Sat, 09 Feb 2008 at 01:10 +0100, Jordi Moles Blanco wrote:
> hi everyone,
>
> i'm afraid to bring bad news.
>
> after applying patch 650, the system seemed to work smoothly for a whole under a lot of work.
> I moved to patch 651 as you suggested, and it didn't last more than 3 hours :(
>
> the thing is... the server didn't hang, but the filesystem isn't accessible at all, not even an "ls" is possible from any of the clients.
>
> these are the node logs:
>
> *************
> nothing logged about this matter!!!
> ************
>
> and these are the client logs:
>
> *************
> 2008-02-08 14:56:59 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate failed for /dummy/maildirsize
> 2008-02-08 14:56:59 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse: 202928: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: Open success on namespace, failed on child node
> 2008-02-08 14:56:59 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 203092: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: Open success on namespace, failed on child node
> 2008-02-08 14:56:59 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 203093: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: Open success on namespace, failed on child node
> 2008-02-08 14:56:59 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 203094: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:57:00 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate failed for /dummy/maildirsize
> 2008-02-08 14:57:00 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse: 203247: /dummy/maildirsize => -1 (2)
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] nm: (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] nm: (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 398639: /dummy/maildirsize => -1 (2)
> 2008-02-08 17:49:46 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate failed for /dummy/maildirsize
> 2008-02-08 17:49:46 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse: 517247: /dummy/maildirsize => -1 (2)
> 2008-02-08 18:03:38 E [afr.c:1564:afr_open_cbk] grup3: (path=/dummy/maildirsize child=espai6) op_ret=-1 op_errno=2
> 2008-02-08 18:03:38 E [afr.c:1564:afr_open_cbk] grup3: (path=/dummy/maildirsize child=espai5) op_ret=-1 op_errno=2
> 2008-02-08 18:03:38 E [unify.c:802:unify_open_cbk] ultim: Open success on namespace, failed on child node
> 2008-02-08 18:03:38 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 641123: /dummy/maildirsize => -1 (2)
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] grup2: (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 894293: /dummy/maildirsize => -1 (2)
> 2008-02-08 18:38:09 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 18:38:09 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 18:38:09 E [unify.c:794:unify_open_cbk] ultim: Open success on child node, failed on namespace
> 2008-02-08 18:38:09 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse: 969868: /dummy/maildirsize => -1 (2)
> 2008-02-08 19:26:58 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate failed for /dummy/maildirsize
> 2008-02-08 19:26:58 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse: 1302224: /dummy/maildirsize => -1 (2)
> 2008-02-08 19:40:03 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 19:40:03 E [afr.c:1564:afr_open_cbk] nm: (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 19:40:03 E [unify.c:794:unify_open_cbk] ultim: Open success on child node, failed on namespace
> *************
>
> i don't know if that makes any sense, but here you've got some cats
>
> dovecots:
>
> cat /mnt/fusectl/1/waiting
>> 6
>
> postfixs:
>
> cat /mnt/fusectl/1/waiting
>> 14
>
> these are the values at the very moment everything hung
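
To make the pattern in the client logs easier to see, the entries can be counted per translator, child and errno. What follows is only a minimal Python sketch, assuming each log entry sits on a single line in the format shown in the logs above (optionally with a leading "> " quote marker). The function name `tally` and the regular expressions are hypothetical helpers written for illustration; they are not part of glusterfs.

```python
import errno
import re
from collections import Counter

# One client log entry on a single line, e.g.
# 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: (path=... child=espai4) op_ret=-1 op_errno=2
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>[^:]+): (?P<msg>.*)$"
)
CHILD = re.compile(r"child=(?P<child>\S+?)\)")
ERRNO = re.compile(r"op_errno=(?P<errno>\d+)")


def tally(lines):
    """Count entries per (translator, child, errno).

    child and errno are None for entries that do not carry them,
    such as the unify and fuse-bridge messages.
    """
    counts = Counter()
    for raw in lines:
        # Drop surrounding whitespace and any leading email quote markers.
        line = raw.strip().lstrip("> ").strip()
        m = ENTRY.match(line)
        if m is None:
            continue  # not a log entry (separator, prose, blank line)
        child = CHILD.search(m["msg"])
        err = ERRNO.search(m["msg"])
        key = (
            m["xlator"],
            child["child"] if child else None,
            int(err["errno"]) if err else None,
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2: "
        "(path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2",
        "> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2: "
        "(path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2",
        "> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: "
        "Open success on namespace, failed on child node",
    ]
    for (xlator, child, code), n in sorted(tally(sample).items(), key=str):
        # errno.errorcode maps a number to its symbolic name (2 -> ENOENT).
        name = errno.errorcode.get(code, "-") if code is not None else "-"
        print(f"{xlator:15} {child or '-':12} {name:8} {n}")
```

Every entry in the logs above that carries an op_errno reports 2, which is ENOENT ("No such file or directory"), so the useful part of such a tally is which children (espai3 to espai6, namespace1, namespace2) report it and when.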