Hi gluster users,

I just upgraded from 3.2.5 to 3.3.1 on a Distributed-Replicate volume with about 2M directories, in order to get a working replace-brick. Running replace-brick now hangs the entire gluster volume for all clients for several minutes, and subsequently hangs the glusterfs process on the destination brick.

I suspect the volume hang is related to https://bugzilla.redhat.com/show_bug.cgi?id=832609 "Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up".

The hung destination replace-brick process sits at 100% CPU and shows no strace output.

  gluster volume replace-brick xxx status
  Number of files migrated = 3
  Current file= /xxx

  %CPU %MEM    TIME+  P COMMAND
   100  0.2  2238:48  2 //sbin/glusterfs -f/var/lib/glusterd/vols/vol01/rb_dst_brick.vol ...

The target brick received about 1% of the intended directories.

The log file -etc-glusterfs-glusterd.vol.log shows only that the replace-brick has started:

  I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
  I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick status request
  I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: Cluster lock held by 3*
  I [glusterd-handler.c:463:glusterd_op_txn_begin] 0-management: Acquired local lock
  I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: s1:/g/c
  I [glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
  I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 2 peers
  I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: s1:/g/c
  I [glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
  I [glusterd-replace-brick.c:1288:rb_update_dstbrick_port] 0-: adding dst-brick port no
  I [glusterd-op-sm.c:2384:glusterd_op_ac_send_commit_op] 0-management: Sent op req to 2 peers
  I [glusterd-rpc-ops.c:1317:glusterd3_1_commit_op_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-rpc-ops.c:1317:glusterd3_1_commit_op_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-rpc-ops.c:607:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: 9*
  I [glusterd-rpc-ops.c:607:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received ACC from uuid: c*
  I [glusterd-op-sm.c:2653:glusterd_op_txn_complete] 0-glusterd: Cleared local lock

Any hints on how to proceed from here and get replace-brick to work are welcome.

regards,
Hans Lambermont
--
Hans Lambermont | Senior Architect
(t) +31407370104
(w) www.shapeways.com