On Tue, May 8, 2012 at 9:46 PM, 任强 <renqiang@xxxxxxxxxx> wrote:
> Dear All:
>
> I have a question. When we have a large cluster, maybe more than 10PB of
> data, with 3 copies of each file and 1TB disks, we need about 30,000
> disks. These disks are very cheap and easily damaged, and we must be able
> to repair a 1TB disk in 30 minutes. As far as I know, in the Gluster
> architecture all the data on a damaged disk is rebuilt onto the new disk
> that replaces it. Given disk write speeds, repairing a 1TB disk in
> Gluster takes more than 5 hours. Can we do it in 30 minutes?

The 5-hour figure is based on a 1TB SATA disk copying at ~50MB/s across a
mix of small and large files and folders (rough math in the P.S. below).
That is, it assumes you literally attached the disk to the system and
transferred the data by hand. I can't think of any faster way to move data
off a 1TB 7200RPM SATA/SAS disk without bending space-time ;). Larger
disks and RAID arrays only make this worse.

This is exactly why we implemented passive self-heal in the first place.
GlusterFS heals files on demand (as they are accessed), so applications
see the least downtime or disruption, and there is plenty of time to heal
the cold data in the background. Minimal downtime is all we should really
care about.

Self-heal in 3.3 has some major improvements:

* It is significantly faster, because healing is now performed entirely on
  the server side (server to server).
* It can heal large files granularly (previously, checksum operations used
  to pause or time out VMs).
* Active healing: Replicate now remembers which files have pending heals
  and heals them when the failed node comes back. Previously you had to
  trigger a namespace-wide recursive directory listing.
* Most importantly, self-heal is no longer a black box: heal-info shows
  pending and currently-healing files (example commands in the P.P.S.).

--
Anand Babu Periasamy
Blog [http://www.unlocksmith.org]

Imagination is more important than knowledge --Albert Einstein
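
P.S. A quick sanity check of the 5-hour number, as a minimal Python
sketch. The 1TB size and the ~50MB/s sustained copy rate are the
assumptions from the discussion above, not measured values:

    # Estimated time to rebuild a failed 1TB disk at a sustained copy rate.
    disk_bytes = 10**12      # 1TB (decimal)
    rate = 50 * 10**6        # ~50MB/s, assumed mixed-workload throughput

    hours = disk_bytes / rate / 3600
    print(round(hours, 1))   # -> 5.6, i.e. "more than 5 hours"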
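
P.P.S. For the heal-info point: an example of inspecting heal state with
the 3.3 CLI. "myvol" is a placeholder volume name:

    # List files with pending heals, per brick:
    gluster volume heal myvol info

    # Trigger a full heal instead of waiting for files to be accessed:
    gluster volume heal myvol full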