Hi group,

I've been in production with Gluster for the last two weeks. No problems until today. As of today I'm hitting the "Transport endpoint is not connected" problem on the client, roughly once every hour:

df: `/services/users/6': Transport endpoint is not connected

Here is my setup: one client and two servers, with two disks per server used as bricks. GlusterFS 3.3 compiled from source.

# gluster volume info

Volume Name: freecloud
Type: Distributed-Replicate
Volume ID: 1cf4804f-12aa-4cd1-a892-cec69fc2cf22
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: XX.25.137.252:/mnt/35be42b4-afb3-48a2-8b3c-17a422fd1e15
Brick2: YY.40.3.216:/mnt/7ee4f117-8aee-4cae-b08c-5e441b703886
Brick3: XX.25.137.252:/mnt/9ee7c816-085d-4c5c-9276-fd3dadac6c72
Brick4: YY.40.3.216:/mnt/311399bc-4d55-445d-8480-286c56cf493e
Options Reconfigured:
cluster.self-heal-daemon: on
performance.cache-size: 256MB
performance.io-thread-count: 32
features.quota: on

Quota is on but not actually used.

---------------------------------------------

# gluster volume status all detail

Status of volume: freecloud
------------------------------------------------------------------------------
Brick            : Brick XX.25.137.252:/mnt/35be42b4-afb3-48a2-8b3c-17a422fd1e15
Port             : 24009
Online           : Y
Pid              : 29221
File System      : xfs
Device           : /dev/sdd1
Mount Options    : rw
Inode Size       : 256
Disk Space Free  : 659.7GB
Total Disk Space : 698.3GB
Inode Count      : 732571968
Free Inodes      : 730418928
------------------------------------------------------------------------------
Brick            : Brick YY.40.3.216:/mnt/7ee4f117-8aee-4cae-b08c-5e441b703886
Port             : 24009
Online           : Y
Pid              : 15496
File System      : xfs
Device           : /dev/sdc1
Mount Options    : rw
Inode Size       : 256
Disk Space Free  : 659.7GB
Total Disk Space : 698.3GB
Inode Count      : 732571968
Free Inodes      : 730410396
------------------------------------------------------------------------------
Brick            : Brick XX.25.137.252:/mnt/9ee7c816-085d-4c5c-9276-fd3dadac6c72
Port             : 24010
Online           : Y
Pid              : 29227
File System      : xfs
Device           : /dev/sdc1
Mount Options    : rw
Inode Size       : 256
Disk Space Free  : 659.9GB
Total Disk Space : 698.3GB
Inode Count      : 732571968
Free Inodes      : 730417864
------------------------------------------------------------------------------
Brick            : Brick YY.40.3.216:/mnt/311399bc-4d55-445d-8480-286c56cf493e
Port             : 24010
Online           : Y
Pid              : 15502
File System      : xfs
Device           : /dev/sdb1
Mount Options    : rw
Inode Size       : 256
Disk Space Free  : 659.9GB
Total Disk Space : 698.3GB
Inode Count      : 732571968
Free Inodes      : 730409337

On server1 I mount the volume and start copying files to it; server1 is used as storage:

209.25.137.252:freecloud 1.4T 78G 1.3T 6% /home/freecloud

One thing to mention: the top-level directory holds a large list of subdirectories, and the list keeps growing:

client1# ls | wc -l
42424

---------------------------------------

I have one client server that mounts the volume and serves the files directly, as they belong to low-traffic web sites. On the client there is no gluster daemon, just the mount:

client1# mount -t glusterfs rscloud1.domain.net:/freecloud /services/users/6/

This all worked fine for the last 2-3 weeks.
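In case it's relevant: this is roughly how I bring the client back after each occurrence. A plain umount fails on the dead FUSE mount, so I use a lazy unmount before remounting (the log file name is just my mount path with the slashes turned into dashes):

client1# umount -l /services/users/6/
client1# mount -t glusterfs rscloud1.domain.net:/freecloud /services/users/6/
client1# tail -n 50 /var/log/glusterfs/services-users-6-.log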
Here is a log from the crash, client1:/var/log/glusterfs/services-users-6-.log:

pending frames:
frame : type(1) op(RENAME)
frame : type(1) op(RENAME)
frame : type(1) op(RENAME)
frame : type(1) op(RENAME)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-07-12 14:51:01
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0
/lib/x86_64-linux-gnu/libc.so.6(+0x32480)[0x7f1e0e9f0480]
/services/glusterfs//lib/libglusterfs.so.0(uuid_unpack+0x0)[0x7f1e0f79d760]
/services/glusterfs//lib/libglusterfs.so.0(+0x4c526)[0x7f1e0f79d526]
/services/glusterfs//lib/libglusterfs.so.0(uuid_utoa+0x26)[0x7f1e0f77ca66]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/features/quota.so(quota_rename_cbk+0x308)[0x7f1e09b940c8]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/cluster/distribute.so(dht_rename_unlink_cbk+0x454)[0x7f1e09dad264]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/cluster/replicate.so(afr_unlink_unwind+0xf7)[0x7f1e09ff23c7]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/cluster/replicate.so(afr_unlink_wind_cbk+0xb6)[0x7f1e09ff43d6]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/protocol/client.so(client3_1_unlink_cbk+0x526)[0x7f1e0a2792c6]
/services/glusterfs//lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f1e0f548e45]
/services/glusterfs//lib/libgfrpc.so.0(rpc_clnt_notify+0xa5)[0x7f1e0f549825]
/services/glusterfs//lib/libgfrpc.so.0(rpc_transport_notify+0x27)[0x7f1e0f5458b7]
/services/glusterfs//lib/glusterfs/3.3.0/rpc-transport/socket.so(socket_event_poll_in+0x34)[0x7f1e0b84de14]
/services/glusterfs//lib/glusterfs/3.3.0/rpc-transport/socket.so(socket_event_handler+0xc7)[0x7f1e0b84e167]
/services/glusterfs//lib/libglusterfs.so.0(+0x40047)[0x7f1e0f791047]
/services/glusterfs/sbin/glusterfs(main+0x34d)[0x404c5d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7f1e0e9dcead]
/services/glusterfs/sbin/glusterfs[0x404f21]

---------

So the client glusterfs process is dying with SIGSEGV (signal 11) during RENAME operations, and the trace runs through the quota translator (quota_rename_cbk) even though quota is enabled but not actually used.

Where should I look for more clues?
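P.S. If a full backtrace would help, I can pull one from a core file with gdb, along these lines (assuming core dumps are enabled before the next crash and the binary has symbols; the core file path below is just an example):

client1# ulimit -c unlimited    # so the next crash leaves a core file
client1# gdb /services/glusterfs/sbin/glusterfs /path/to/core    # core path is an example
(gdb) bt
(gdb) thread apply all bt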