On Thu, Jan 10, 2013 at 12:50:48PM -0500, Liang Ma wrote:
> I assume to replace a failed replicate disk or node should be a
> standard procedure, isn't it? I couldn't find anything related to
> this in the 3.3 manual.

You'd have thought so, wouldn't you :-(

I know of two options.

(1) If the server itself is OK, but the brick filesystem you're exporting
from that server has died, then just stop glusterd, erase (mkfs) the
filesystem the brick is on, remount it, and restart glusterd. After a few
minutes, self-heal will kick in and copy the data back for replicated
volumes. At least, it did in a test setup I tried once. (There's a rough
command sketch in the P.S. below.)

This of course assumes your data filesystem is separate from your OS
filesystem, which I'd suggest is a good idea anyway.

(2) If the whole server has died, or you have to re-install the OS from
scratch, but the replacement server has the same hostname as the old one,
then there's a different procedure. It was documented at

  http://gluster.org/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server

for glusterfs 3.2. It is almost the same for glusterfs 3.3, except that
the config directory has moved (from /etc/glusterd to /var/lib/glusterd),
so it works if you change these two steps:

  grep server3 /var/lib/glusterd/peers/*
  echo UUID=... >/var/lib/glusterd/glusterd.info

(The P.P.S. below shows those two steps in a bit more context.)

HTH,

Brian.
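
P.S. A rough sketch of option (1), run on the affected server. The device,
mount point, filesystem type and volume name are only examples (I'm
assuming the brick is /dev/sdb1, mounted at /export/brick1, formatted as
xfs, and part of a volume called myvol); adjust them for your setup, and
use whatever your distro uses to stop/start services.

  # stop glusterd on the server with the dead brick filesystem
  service glusterd stop

  # recreate the brick filesystem (this destroys anything left on it)
  umount /export/brick1
  mkfs.xfs -f /dev/sdb1
  mount /dev/sdb1 /export/brick1

  # restart glusterd; self-heal should repopulate the brick for
  # replicated volumes after a few minutes
  service glusterd start

  # optionally, kick off a full self-heal instead of waiting
  gluster volume heal myvol full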
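
P.P.S. For option (2), those two changed steps sit in the wiki procedure
roughly like this ("server3" is just the wiki's name for the crashed and
rebuilt server; substitute your own hostname and the UUID you find):

  # on one of the surviving peers: the filename of the peers file that
  # matches is the UUID the old server had
  grep server3 /var/lib/glusterd/peers/*

  # on the rebuilt server, with glusterd stopped: give it that UUID back
  echo UUID=... >/var/lib/glusterd/glusterd.info

  # then start glusterd and follow the rest of the wiki page
  # (peer probe a surviving server, restart, let self-heal run)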