Re: patch 651, it keeps hanging


Hi again,

Well... it is worse than I first thought. glusterfs is not working
anymore. I've already tried rebooting all the servers, nodes and
clients, and the filesystem is still not available :( .
The clients keep reporting errors while communicating with the
nodes, and there's no way they can mount the filesystem :(
I see that I'll have to reinstall all the servers, because something
is really badly broken and glusterfs won't work anymore. The servers
haven't hung at all, and all the reboots have been "clean"; only
glusterfs is broken.


On Sat, 09 Feb 2008 at 01:10 +0100, Jordi Moles Blanco wrote:
> hi everyone,
> 
> i'm afraid to bring bad news.
> 
> after applying patch 650, the system seemed to work smoothly for a whole
> under a lot of work.
> I moved to patch 651 as you suggested, and it didn't last more than 3
> hours :(
> 
> the thing is... the servers didn't hang, but the filesystem isn't
> accessible at all; not even an "ls" is possible from any of the clients.
> 
> these are the node logs:
> 
> *************
> nothing logged about this matter!!!
> ************
> 
> and these are the client logs:
> 
> *************
> 2008-02-08 14:56:59 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate
> failed for /dummy/maildirsize
> 2008-02-08 14:56:59 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse:
> 202928: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: Open success
> on namespace, failed on child node
> 2008-02-08 14:56:59 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 203092: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: Open success
> on namespace, failed on child node
> 2008-02-08 14:56:59 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 203093: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 14:56:59 E [unify.c:802:unify_open_cbk] ultim: Open success
> on namespace, failed on child node
> 2008-02-08 14:56:59 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 203094: /dummy/maildirsize => -1 (2)
> 2008-02-08 14:57:00 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate
> failed for /dummy/maildirsize
> 2008-02-08 14:57:00 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse:
> 203247: /dummy/maildirsize => -1 (2)
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] nm:
> (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] nm:
> (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:2398:afr_selfheal_getxattr_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 17:33:05 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 398639: /dummy/maildirsize => -1 (2)
> 2008-02-08 17:49:46 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate
> failed for /dummy/maildirsize
> 2008-02-08 17:49:46 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse:
> 517247: /dummy/maildirsize => -1 (2)
> 2008-02-08 18:03:38 E [afr.c:1564:afr_open_cbk] grup3:
> (path=/dummy/maildirsize child=espai6) op_ret=-1 op_errno=2
> 2008-02-08 18:03:38 E [afr.c:1564:afr_open_cbk] grup3:
> (path=/dummy/maildirsize child=espai5) op_ret=-1 op_errno=2
> 2008-02-08 18:03:38 E [unify.c:802:unify_open_cbk] ultim: Open success
> on namespace, failed on child node
> 2008-02-08 18:03:38 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 641123: /dummy/maildirsize => -1 (2)
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai3) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] grup2:
> (path=/dummy/maildirsize child=espai4) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 18:30:33 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 894293: /dummy/maildirsize => -1 (2)
> 2008-02-08 18:38:09 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 18:38:09 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 18:38:09 E [unify.c:794:unify_open_cbk] ultim: Open success
> on child node, failed on namespace
> 2008-02-08 18:38:09 E [fuse-bridge.c:670:fuse_fd_cbk] glusterfs-fuse:
> 969868: /dummy/maildirsize => -1 (2)
> 2008-02-08 19:26:58 E [unify.c:260:unify_lookup_cbk] ultim: Revalidate
> failed for /dummy/maildirsize
> 2008-02-08 19:26:58 E [fuse-bridge.c:431:fuse_entry_cbk] glusterfs-fuse:
> 1302224: /dummy/maildirsize => -1 (2)
> 2008-02-08 19:40:03 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace2) op_ret=-1 op_errno=2
> 2008-02-08 19:40:03 E [afr.c:1564:afr_open_cbk] nm:
> (path=/dummy/maildirsize child=namespace1) op_ret=-1 op_errno=2
> 2008-02-08 19:40:03 E [unify.c:794:unify_open_cbk] ultim: Open success
> on child node, failed on namespace
> *************
> 
> i don't know if it helps, but here is some "cat" output from the
> fusectl "waiting" files
> 
> dovecot machines:
> 
> cat /mnt/fusectl/1/waiting   (output: 6)
> 
> postfix machines:
> 
> cat /mnt/fusectl/1/waiting   (output: 14)
> 
> these are the values at the very moment everything hung
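
In case it is useful for collecting these numbers on several clients at once, here is a minimal sketch that reads the "waiting" file of every FUSE connection under a fusectl mount. It assumes fusectl is mounted at /mnt/fusectl as in the commands above; read_waiting is a hypothetical helper name, not an existing tool. As far as I recall from the kernel's FUSE documentation, "waiting" is the number of requests waiting to be transferred to userspace or being processed by the filesystem daemon, so a non-zero value that never drops while nothing is touching the mount would suggest a hung or deadlocked filesystem.

```python
from pathlib import Path


def read_waiting(fusectl_root="/mnt/fusectl"):
    """Return {connection_id: waiting_count} for each FUSE connection directory.

    The mount point is an assumption taken from the commands quoted above.
    """
    result = {}
    for conn in sorted(Path(fusectl_root).iterdir()):
        waiting = conn / "waiting"
        if conn.is_dir() and waiting.is_file():
            result[conn.name] = int(waiting.read_text().strip())
    return result
```

Calling read_waiting() a few seconds apart and comparing the two results gives a crude check of whether the counters are moving at all.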
> 
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> http://lists.nongnu.org/mailman/listinfo/gluster-devel




