After some testing I found out that my backup job using rsync caused the error 'endpoint not connected'. After stopping the cron job, everything seemed to be OK. Is there a way to take backup snapshots of the volumes with gluster itself?
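As far as I know, GlusterFS 3.2 has no built-in volume snapshots; the gluster snapshot commands only arrived in 3.6 and require bricks on thin-provisioned LVM. The closest gluster-native backup mechanism in 3.2 is geo-replication, which keeps an asynchronous copy of a volume on another host. A minimal sketch, assuming passwordless SSH to a backup machine (the host name and target directory below are placeholders, not from this thread):

# Start an asynchronous replica of the volume on a remote backup host
gluster volume geo-replication sambavol backuphost:/backup/sambavol start

# Check that the session is established and replicating
gluster volume geo-replication sambavol backuphost:/backup/sambavol status

Backing up from the replica instead of rsyncing the live FUSE mount would also take rsync's metadata-heavy scan off the client that is crashing below.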
-----------------------------------------------
Daniel Müller
Head of IT (EDV)
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: mueller at tropenklinik.de
Internet: www.tropenklinik.de
-----------------------------------------------

-----Original Message-----
From: Daniel Müller [mailto:mueller at tropenklinik.de]
Sent: Thursday, 28 March 2013 14:17
To: 'mueller at tropenklinik.de'; 'Pranith Kumar K'
Cc: 'Reinhard Marstaller'; 'gluster-users at gluster.org'
Subject: RE: Glusterfs gives up with endpoint not connected

The third part, the output of /var/log/messages concerning the raid5hs array:

[root at tuepdc /]# tail -f /var/log/messages
Mar 28 13:21:32 tuepdc kernel: SCSI device sdd: drive cache: write back
Mar 28 13:21:32 tuepdc kernel: SCSI device sde: 1953525168 512-byte hdwr sectors (1000205 MB)
Mar 28 13:21:32 tuepdc kernel: sde: Write Protect is off
Mar 28 13:21:32 tuepdc kernel: SCSI device sde: drive cache: write back
Mar 28 13:21:32 tuepdc kernel: SCSI device sdf: 1953525168 512-byte hdwr sectors (1000205 MB)
Mar 28 13:21:32 tuepdc kernel: sdf: Write Protect is off
Mar 28 13:21:32 tuepdc kernel: SCSI device sdf: drive cache: write back
Mar 28 13:21:32 tuepdc kernel: SCSI device sdg: 1953525168 512-byte hdwr sectors (1000205 MB)
Mar 28 13:21:32 tuepdc kernel: sdg: Write Protect is off
Mar 28 13:21:32 tuepdc kernel: SCSI device sdg: drive cache: write back

-----------------------------------------------
Daniel Müller
Head of IT (EDV)
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: mueller at tropenklinik.de
Internet: www.tropenklinik.de
-----------------------------------------------

-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On behalf of Daniel Müller
Sent: Thursday, 28 March 2013 13:57
To: 'Pranith Kumar K'
Cc: 'Reinhard Marstaller'; gluster-users at gluster.org
Subject: Re: Glusterfs gives up with endpoint not connected

Now part two of raid5hs-glusterfs-export.log:

attr (utimes) on /raid5hs/glusterfs/export/windows/winuser/schneider/schneider/Verwaltung/baummaßnahmen/Bauvorhaben Umsetzung/Parkierung/SKIZZE_TPLK_Lageplan.pdf failed: Read-only file system
pending frames:

patchset: v3.2.0
signal received: 11
time of crash: 2013-03-25 22:50:46
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.0
/lib64/libc.so.6[0x30c0a302d0]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/features/marker.so(marker_setattr_cbk+0x139)[0x2aaaaba9de79]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/performance/io-threads.so(iot_setattr_cbk+0x88)[0x2aaaab88d718]
/opt/glusterfs/3.2.0/lib64/libglusterfs.so.0(default_setattr_cbk+0x88)[0x2b1a834a5f28]
/opt/glusterfs/3.2.0/lib64/libglusterfs.so.0(default_setattr_cbk+0x88)[0x2b1a834a5f28]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/storage/posix.so(posix_setattr+0x1fc)[0x2aaaab2560bc]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/features/access-control.so(ac_setattr_resume+0xe9)[0x2aaaab469039]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/features/access-control.so(ac_setattr+0x49)[0x2aaaab46a979]
/opt/glusterfs/3.2.0/lib64/libglusterfs.so.0(default_setattr+0xe9)[0x2b1a8349f659]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/performance/io-threads.so(iot_setattr_wrapper+0xe9)[0x2aaaab890749]
/opt/glusterfs/3.2.0/lib64/libglusterfs.so.0(call_resume+0xd81)[0x2b1a834b0191]
/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/performance/io-threads.so(iot_worker+0x119)[0x2aaaab894229]
/lib64/libpthread.so.0[0x30c160673d]
/lib64/libc.so.6(clone+0x6d)[0x30c0ad44bd]
---------
[2013-03-26 08:04:48.577056] W [socket.c:419:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
[2013-03-26 08:04:48.613068] W [socket.c:1846:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
[2013-03-26 08:04:49.187484] W [graph.c:274:gf_add_cmdline_options] 0-sambavol-server: adding option 'listen-port' for volume 'sambavol-server' with value '24009'
[2013-03-26 08:04:49.253395] W [rpc-transport.c:447:validate_volume_options] 0-tcp.sambavol-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2013-03-26 08:04:49.287651] E [posix.c:4369:init] 0-sambavol-posix: Directory '/raid5hs/glusterfs/export' doesn't exist, exiting.
[2013-03-26 08:04:49.287709] E [xlator.c:1390:xlator_init] 0-sambavol-posix: Initialization of volume 'sambavol-posix' failed, review your volfile again
[2013-03-26 08:04:49.287721] E [graph.c:331:glusterfs_graph_init] 0-sambavol-posix: initializing translator failed
[2013-03-26 08:04:49.287731] E [graph.c:503:glusterfs_graph_activate] 0-graph: init failed
[2013-03-26 08:04:49.287982] W [glusterfsd.c:700:cleanup_and_exit] (-->/opt/glusterfs/3.2.0/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa2) [0x2b38e21d63d2] (--:
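The "Read-only file system" error just before the crash suggests that the ext3 filesystem under the brick was remounted read-only after an I/O error; that would also explain the posix translator later complaining that '/raid5hs/glusterfs/export' doesn't exist. A quick way to check this on the brick host (generic commands, not taken from the original thread):

# /proc/mounts shows the live mount flags; look for 'ro' instead of 'rw'
grep raid5hs /proc/mounts

# ext3 logs an error before switching to read-only, if its error policy is remount-ro
dmesg | grep -iE 'ext3|remount'

# Simple write test directly on the brick directory
touch /raid5hs/glusterfs/export/.rw-test && rm /raid5hs/glusterfs/export/.rw-test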
-----------------------------------------------
Daniel Müller
Head of IT (EDV)
Tropenklinik Paul-Lechler-Krankenhaus
Paul-Lechler-Str. 24
72076 Tübingen
Tel.: 07071/206-463, Fax: 07071/206-499
eMail: mueller at tropenklinik.de
Internet: www.tropenklinik.de
-----------------------------------------------

-----Original Message-----
From: Pranith Kumar K [mailto:pkarampu at redhat.com]
Sent: Thursday, 28 March 2013 12:34
To: mueller at tropenklinik.de
Cc: gluster-users at gluster.org; Reinhard Marstaller
Subject: Re: Glusterfs gives up with endpoint not connected

On 03/28/2013 03:48 PM, Daniel Müller wrote:
> Dear all,
>
> Right out of the blue glusterfs is not working fine any more; every now
> and then it stops working, telling me 'Endpoint not connected' and
> writing core files:
>
> [root at tuepdc /]# file core.15288
> core.15288: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV),
> SVR4-style, from 'glusterfs'
>
> My version:
> [root at tuepdc /]# glusterfs --version
> glusterfs 3.2.0 built on Apr 22 2011 18:35:40
> Repository revision: v3.2.0
> Copyright (c) 2006-2010 Gluster Inc. <http://www.gluster.com>
> GlusterFS comes with ABSOLUTELY NO WARRANTY.
> You may redistribute copies of GlusterFS under the terms of the GNU
> Affero General Public License.
>
> My /var/log/glusterfs/bricks/raid5hs-glusterfs-export.log:
>
> [2013-03-28 10:47:07.243980] I [server.c:438:server_rpc_notify]
> 0-sambavol-server: disconnected connection from 192.168.130.199:1023
> [2013-03-28 10:47:07.244000] I
> [server-helpers.c:783:server_connection_destroy] 0-sambavol-server:
> destroyed connection of
> tuepdc.local-16600-2013/03/28-09:32:28:258428-sambavol-client-0
>
> [root at tuepdc bricks]# gluster volume info
>
> Volume Name: sambavol
> Type: Replicate
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.130.199:/raid5hs/glusterfs/export
> Brick2: 192.168.130.200:/raid5hs/glusterfs/export
> Options Reconfigured:
> network.ping-timeout: 5
> performance.quick-read: on
>
> Gluster is running on an ext3 raid5 array with hot spare on both hosts:
> [root at tuepdc bricks]# mdadm --detail /dev/md0
> /dev/md0:
>         Version : 0.90
>   Creation Time : Wed May 11 10:08:30 2011
>      Raid Level : raid5
>      Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
>   Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>    Raid Devices : 3
>   Total Devices : 4
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Thu Mar 28 11:13:21 2013
>           State : clean
>  Active Devices : 3
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 1
>
>          Layout : left-symmetric
>      Chunk Size : 64K
>
>            UUID : c484e093:018a2517:56e38f5e:1a216491
>          Events : 0.250
>
>     Number   Major   Minor   RaidDevice State
>        0       8       49        0      active sync   /dev/sdd1
>        1       8       65        1      active sync   /dev/sde1
>        2       8       97        2      active sync   /dev/sdg1
>
>        3       8       81        -      spare          /dev/sdf1
>
> [root at tuepdc glusterfs]# tail -f mnt-glusterfs.log
> [2013-03-28 10:57:40.882566] I [rpc-clnt.c:1531:rpc_clnt_reconfig]
> 0-sambavol-client-0: changing port to 24009 (from 0)
> [2013-03-28 10:57:40.883636] I [rpc-clnt.c:1531:rpc_clnt_reconfig]
> 0-sambavol-client-1: changing port to 24009 (from 0)
> [2013-03-28 10:57:44.806649] I
> [client-handshake.c:1080:select_server_supported_programs]
> 0-sambavol-client-0: Using Program GlusterFS-3.1.0, Num (1298437),
> Version (310)
> [2013-03-28 10:57:44.806857] I
> [client-handshake.c:913:client_setvolume_cbk]
> 0-sambavol-client-0: Connected to 192.168.130.199:24009, attached to
> remote volume '/raid5hs/glusterfs/export'.
> [2013-03-28 10:57:44.806876] I [afr-common.c:2514:afr_notify]
> 0-sambavol-replicate-0: Subvolume 'sambavol-client-0' came back up;
> going online.
> [2013-03-28 10:57:44.811557] I [fuse-bridge.c:3316:fuse_graph_setup] 0-fuse:
> switched to graph 0
> [2013-03-28 10:57:44.811773] I [fuse-bridge.c:2897:fuse_init]
> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13
> kernel 7.10
> [2013-03-28 10:57:44.812139] I [afr-common.c:836:afr_fresh_lookup_cbk]
> 0-sambavol-replicate-0: added root inode
> [2013-03-28 10:57:44.812217] I
> [client-handshake.c:1080:select_server_supported_programs]
> 0-sambavol-client-1: Using Program GlusterFS-3.1.0, Num (1298437),
> Version (310)
> [2013-03-28 10:57:44.812767] I
> [client-handshake.c:913:client_setvolume_cbk]
> 0-sambavol-client-1: Connected to 192.168.130.200:24009, attached to
> remote volume '/raid5hs/glusterfs/export'.
>
> How can I fix this issue?
>
> Daniel
>
> -----------------------------------------------
> Daniel Müller
> Head of IT (EDV)
> Tropenklinik Paul-Lechler-Krankenhaus
> Paul-Lechler-Str. 24
> 72076 Tübingen
> Tel.: 07071/206-463, Fax: 07071/206-499
> eMail: mueller at tropenklinik.de
> Internet: www.tropenklinik.de
> -----------------------------------------------

Could you paste the traceback that is printed in the mount's log file for that crash? Search for "crash" in the logs; you will see the trace right after that line. Paste it here.

Pranith.

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
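For reference, a search along the lines Pranith suggests could look like this (the log path matches the mount log shown earlier in this thread; the number of context lines is a guess at how much of the trace to show):

# Print each "crash" marker plus the lines of backtrace that follow it
grep -n -A25 crash /var/log/glusterfs/mnt-glusterfs.log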