> "Larry Bates" <larry.bates at vitalesafe.com> wrote on 01/24/2012 08:34:03 AM: > > I'll admit to not understanding your response and would really > > appreciate a little more explanation. I only have two servers > > with 8 x 2TB each in AFR-DHT so far, but we are growing and will > > continue to do so basically forever. I'm interested in experience of people using this model as well, preferably on larger systems. How do you find gluster handles individual drive failures? Is it possible to mark a single disk/brick as down without downing its replica? Do you need to keep spare drive slots in the chassis so that you can replace-brick <dud> <new> onto another drive in the same chassis? In fact, does replace-brick <dud> <new> even work if <dud> has died? If you have a whole bunch of bricks sharing the same underlying disk (e.g. /disk1/foo, /disk1/bar, /disk1/baz), then presumably you need to remember to replace-brick every one onto the new drive? I found http://gluster.org/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server but this is about a whole server failing, not a single brick within one server. Clearly there is a glusterd uuid, but I'm not sure if each brick also has a uuid. Unfortunately I can't find any information about handling individual brick failures in the General FAQ or the Technical FAQ either. ISTM that if you were relying on this as your sole method for handling drive failures, which are bound to happen from time to time, you'd need to be well-drilled in the procedure. Many thanks, Brian.