Hi, yesterday I got a strange crash on almost all bricks, the same type of crash, repeated:

[2015-06-09 18:23:56.407520] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.407580] I [server-handshake.c:585:server_setvolume] 0-atlas-data-01-server: accepted client from atlas-storage-10.roma1.infn.it-7546-2015/06/09-18:23:55:618600-atlas-data-01-client-0-0-0 (version: 3.7.1)
[2015-06-09 18:23:56.407707] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.407772] I [server-handshake.c:585:server_setvolume] 0-atlas-data-01-server: accepted client from atlas-storage-09.roma1.infn.it-25429-2015/06/09-18:18:57:328935-atlas-data-01-client-0-0-0 (version: 3.7.1)
[2015-06-09 18:23:56.415905] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.415947] I [server-handshake.c:585:server_setvolume] 0-atlas-data-01-server: accepted client from atlas-storage-10.roma1.infn.it-7530-2015/06/09-18:23:54:608880-atlas-data-01-client-0-0-0 (version: 3.7.1)
[2015-06-09 18:23:56.433956] E [posix-handle.c:157:posix_make_ancestryfromgfid] 0-atlas-data-01-posix: could not read the link from the gfid handle /bricks/atlas/data01/data/.glusterfs/74/4b/744b7cf0-258f-4dea-b4d9-7001bb21ca56 (No such file or directory)
[2015-06-09 18:23:56.433954] E [posix-handle.c:157:posix_make_ancestryfromgfid] 0-atlas-data-01-posix: could not read the link from the gfid handle /bricks/atlas/data01/data/.glusterfs/74/4b/744b7cf0-258f-4dea-b4d9-7001bb21ca56 (No such file or directory)

pending frames:
frame : type(0) op(11)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-06-09 18:23:56
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f0f6446ed92]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f0f644899ed]
/lib64/libc.so.6(+0x35650)[0x7f0f62e60650]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(upcall_cache_invalidate+0xb5)[0x7f0f5537cab5]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(up_readdir_cbk+0x1a2)[0x7f0f55376292]
/usr/lib64/glusterfs/3.7.1/xlator/features/locks.so(pl_readdirp_cbk+0x164)[0x7f0f5558dc94]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_readdirp_cbk+0x299)[0x7f0f557a6829]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_readdirp_cbk+0x181)[0x7f0f559b5fb1]
/usr/lib64/glusterfs/3.7.1/xlator/storage/posix.so(posix_readdirp+0x143)[0x7f0f56f0cfc3]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_readdirp+0x246)[0x7f0f559b0d46]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_readdirp+0x18d)[0x7f0f557a45cd]
/usr/lib64/glusterfs/3.7.1/xlator/features/locks.so(pl_readdirp+0x14e)[0x7f0f5558c7ee]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(up_readdirp+0x17a)[0x7f0f5537abfa]
/lib64/libglusterfs.so.0(default_readdirp_resume+0x134)[0x7f0f644809e4]
/lib64/libglusterfs.so.0(call_resume+0x7d)[0x7f0f64498c7d]
/usr/lib64/glusterfs/3.7.1/xlator/performance/io-threads.so(iot_worker+0x123)[0x7f0f5516b353]
/lib64/libpthread.so.0(+0x7df5)[0x7f0f635dadf5]
/lib64/libc.so.6(clone+0x6d)[0x7f0f62f211ad]
---------

I'm not sure whether the missing file is the culprit, but if it is, how can I fix it? For the moment I've recreated the bricks from a backup, so I'm fine, but it would be nice to understand what to do if it happens again. I still have the contents of the old crashed brick.
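In case it helps: as far as I understand the standard .glusterfs layout, the handle path is just the gfid split into two two-character prefixes, so the missing link from the log can be checked by hand like this (brick path and gfid taken from the log above; this is only how I would inspect it, not a verified repair procedure):

```shell
# Rebuild the handle path for the gfid from the error message
# (assumes the usual .glusterfs/<xx>/<yy>/<gfid> layout; bash substrings).
brick=/bricks/atlas/data01/data
gfid=744b7cf0-258f-4dea-b4d9-7001bb21ca56
handle="$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"
echo "$handle"

# For a directory, the handle should be a symlink back into its parent;
# readlink failing with ENOENT matches the error in the brick log.
readlink "$handle" || echo "handle missing or not a symlink"
```

If the handle really is gone, the next step would presumably be to find which directory on the brick carries that trusted.gfid xattr (e.g. with getfattr) before trying to recreate anything.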
The crash happened every time I restarted glusterd, always in the same way. I'm using Gluster 3.7.1 on CentOS 7.1, with the following kind of configuration:

# gluster volume info atlas-data-01

Volume Name: atlas-data-01
Type: Replicate
Volume ID: 854620a1-3e88-4e76-91ce-486996bf6a12
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1:/bricks/atlas/data01/data
Brick2: node2:/bricks/atlas/data01/data
Brick3: node3:/bricks/atlas/data02/data
Options Reconfigured:
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
nfs.disable: true
server.allow-insecure: on
ganesha.enable: off
nfs-ganesha: disable

I was playing with Ganesha and tried to enable it on the volumes (but failed, as you can see from my other messages). I'm not sure it is related, but all the crashed bricks belonged to the volumes where I tried to enable Ganesha.

Thanks,
Alessandro
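One more thought, since the backtrace dies inside the upcall translator: if it happens again I would try turning off the upcall-driven cache invalidation on the affected volume before restarting, something like the following (assuming I have the option name right for 3.7.x; I haven't verified that this avoids the crash, it's just where the backtrace points):

```shell
# Hypothetical workaround, untested: disable upcall cache invalidation
# on the volume whose bricks crash, then check the reconfigured options.
gluster volume set atlas-data-01 features.cache-invalidation off
gluster volume info atlas-data-01
```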
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users