Hi again,

Are rename(2) operations supposed to survive the death of a brick? I tried
simulating an outage by unmounting the filesystem for a brick, and on the
client, a tar(1), which is a heavy rename(2) user, complains a lot:

tar: Cannot rename usr/src/crypto/dist/ipsec-tools/src/racoon/rfc/draft-ietf-ipsec-nat-t-ike-04.txt.29982d to usr/src/crypto/dist/ipsec-tools/src/racoon/rfc/draft-ietf-ipsec-nat-t-ike-04.txt (No such file or directory)

I reformatted the unmounted filesystem and remounted it with the intent of
having it rebuilt, but self-heal does not start and tar(1) carries on
complaining. Restarting glusterd/glusterfsd on the server does not help,
and the gluster volume replace-brick command refuses to work when the old
and new bricks are the same.

Then, I do not know if it is related, but the client crashes:

Program terminated with signal 11, Segmentation fault.
#0  0xba3925f7 in afr_readdirp_cbk (frame=0xbaf022d0, cookie=0x1, this=0xbb9c5000, op_ret=1, op_errno=2, entries=0xbfbfe088) at afr-dir-read.c:592
592             if ((local->fd->inode == local->fd->inode->table->root)
(gdb) bt
#0  0xba3925f7 in afr_readdirp_cbk (frame=0xbaf022d0, cookie=0x1, this=0xbb9c5000, op_ret=1, op_errno=2, entries=0xbfbfe088) at afr-dir-read.c:592
#1  0xba3e6688 in client3_1_readdirp_cbk (req=0xb980174c, iov=0xb980176c, count=1, myframe=0xbaf02dc0) at client3_1-fops.c:1939
#2  0xbbb8a586 in rpc_clnt_handle_reply (clnt=0xbb962480, pollin=0xb4733ac0) at rpc-clnt.c:736
#3  0xbbb8a773 in rpc_clnt_notify (trans=0xbb9a2000, mydata=0xbb9624a0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x0) at rpc-clnt.c:849
#4  0xbbb8589d in rpc_transport_notify (this=0xaaaaaaaa, event=RPC_TRANSPORT_MSG_RECEIVED, data=0xb4733ac0) at rpc-transport.c:918
#5  0xbba21c8f in socket_event_poll_in (this=0xbb9a2000) at socket.c:1647
#6  0xbba21e5b in socket_event_handler (fd=17, idx=5, data=0xbb9a2000, poll_in=1, poll_out=0, poll_err=0) at socket.c:1762
#7  0xbbbc0e4a in event_dispatch_poll (event_pool=0xbb90e0e0) at event.c:366
#8  0xbbbc099b in event_dispatch (event_pool=0x0) at event.c:956
#9  0x0804cd91 in main (argc=5, argv=0xbfbfe8c4) at glusterfsd.c:1503
(gdb) print local->fd->inode
$1 = (struct _inode *) 0xaaaaaaaa
(gdb) x/16w local->fd
0xb8c0107c:     0x00001808      0x00000000      0x00000003      0xb8c01088
0xb8c0108c:     0xb8c01088      0xaaaaaaaa      0xdead0007      0x00000000
0xb8c0109c:     0x00000000      0xbb909640      0x00000014      0xbabebabe
0xb8c010ac:     0xcafecafe      0x00000001      0x0000751e      0x00000000

Once the client is remounted, self-heal is correctly triggered on the server
and everything is fixed.
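In case it helps when reading the backtrace: local->fd->inode is 0xaaaaaaaa,
which looks like a fill/poison pattern rather than a live pointer, so the fd
seems to have been overwritten or freed by the time the readdirp callback ran.
Below is a minimal sketch of the pointer chain that afr-dir-read.c:592 walks,
using simplified stand-in structures (not the real GlusterFS types), just to
show where the dereference faults; a NULL check would not help here since the
pointer is non-NULL garbage:

/*
 * Simplified stand-ins for the structures involved at afr-dir-read.c:592.
 * These are NOT the real GlusterFS definitions; they only model the
 * pointer chain walked by the crashing line.
 */
#include <stdbool.h>
#include <stdio.h>

struct inode_table;

struct inode {
        struct inode_table *table;
};

struct inode_table {
        struct inode *root;          /* root inode of the table */
};

struct fd {
        struct inode *inode;         /* 0xaaaaaaaa in the core dump above */
};

struct afr_local {
        struct fd *fd;
};

/*
 * The test at line 592 essentially asks "is this readdirp on the root
 * directory?".  With local->fd->inode pointing into freed or poisoned
 * memory, reading inode->table through that pointer faults.
 */
static bool
readdirp_is_on_root(struct afr_local *local)
{
        return local->fd->inode == local->fd->inode->table->root;
}

int
main(void)
{
        struct inode_table tbl;
        struct inode root = { .table = &tbl };
        struct fd fd = { .inode = &root };
        struct afr_local local = { .fd = &fd };

        tbl.root = &root;
        printf("on root: %d\n", readdirp_is_on_root(&local));
        return 0;
}

-- 
Emmanuel Dreyfus
manu@xxxxxxxxxx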