After painful experience, we have found that the only way to do what
you are trying is to add new, empty volumes to glusterfs and let it
self-heal the files onto them with find /mnt/gfs -ls. Neither starting
from a snapshot nor rsync'ing (even with rsync 3's -X option) to "speed
up" the process helps; both generally just end up making a mess.

Additionally, for it to work reliably, you need to shut off any large
production load on the system... which for 1TB volumes on EBS is
unfortunately going to mean a lot of downtime for you.

Scenarios in which we've had to use this blank-volume approach:

* Recovering when one of the EBS volumes in a replica pair fails; we
  start with a blank volume and self-heal from the remaining volume.

* Adding a new volume to a distribute volume so as to distribute over
  more disk space. In this case you have to start by adding a blank
  volume and then also go through the "re-balancing" process... and
  note that glusterfs is not always very balanced in the way it
  distributes files across distribute subvolumes.

* Removing a volume from a distribute volume. In this case we were
  running distribute on top of replicate and decided to just use
  larger replicate volumes instead of distribute (b/c of the uneven way
  distribute was distributing, we were going to run out of space in the
  glusterfs volume even though one of the underlying bricks was nearly
  empty, even after "re-balancing").

Barry

On Mon, Oct 25, 2010 at 7:52 AM, Joshua Saayman <joshua at saayman.me> wrote:
> Another GlusterFS 3.1 question on my blog
> (http://cloudarchitect.posterous.com). Any help/ideas will be
> appreciated.
>
> Thanks
> Joshua
>
> ----
>
> Here's my challenge: I have several 1 tb ebs volumes now that are
> un-replicated and reaching capacity. I'm trying to suss out the most
> efficient way to get each one of these into its own replicated 4 tb
> gluster fs.
>
> My hope was that I could snapshot each one, restore it twice from the
> snapshot, and launch this pair as a pre-replicated gluster FS where
> the 'heal' process (find . | xargs stat) allows the gluster daemon to
> rationalize the situation, and then add a second pair of empty bricks
> - to grow on. As you know, all this can be done in just a few minutes.
>
> Well, I have now tried this, and I'm afraid I've got a goopy mess...
> so much for -that- shortcut. Rather than try to debug the situation,
> I'm curious if there is a better high speed import strategy? A dd for
> gluster? Any thoughts?
>
> If all else fails, I'm happy to create the naked file system and just
> do an rsync, but last time I did this (when I was testing out LVM) the
> rsync took 3 days. In general, I'm thinking about this exercise not
> just as a migration, but also as a test of emergency restore, and a 3
> day emergency restore is an awful long time.
>
> And one more time, thanks for this informative series. I'm curious
> where you go for your info (other than senior engineers at
> gluster!)... This topic - gluster on EBS - seems remarkably sparsely
> covered for all its massive applicability to cloud infrastructure...
> is there a gluster on ebs group somewhere? It's not as though either
> technology is brand new....
>
> Regards,
> Gart
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>

--
Barry Jaspan
Senior Architect | Acquia <http://acquia.com>
barry.jaspan at acquia.com | (c) 617.905.2208 | (w) 978.296.5231

"Get a free, hosted Drupal 7 site: http://www.drupalgardens.com"
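
For reference, here is a minimal Python sketch of the self-heal walk
mentioned in the reply (find /mnt/gfs -ls). It assumes, as with find,
that stat'ing every entry through the glusterfs client mount is what
prompts replicate to heal it onto the blank volume. The function name
touch_tree is made up for illustration, and /mnt/gfs is just the mount
point used in the message. Treat it as a sketch, not a tested recovery
tool.

```python
import os


def touch_tree(mount_point):
    """lstat() every entry under mount_point, roughly like `find <dir> -ls`.

    Assumption: the path is a glusterfs *client* mount (not a brick
    directory), so each lstat() is a lookup that passes through glusterfs.
    Returns (number_of_entries_statted, list_of_(path, error)_failures).
    """
    statted = 0
    failed = []

    def lstat(path):
        nonlocal statted
        try:
            os.lstat(path)
            statted += 1
        except OSError as err:
            # Record the failure and keep walking; one bad entry
            # should not abort the whole pass.
            failed.append((path, err))

    def on_walk_error(err):
        # Called by os.walk when a directory cannot be listed.
        failed.append((err.filename, err))

    lstat(mount_point)
    for dirpath, dirnames, filenames in os.walk(mount_point, onerror=on_walk_error):
        for name in dirnames + filenames:
            lstat(os.path.join(dirpath, name))

    return statted, failed


# Hypothetical usage, with the mount point from the message:
#   statted, failed = touch_tree("/mnt/gfs")
```

The returned failure list gives a rough idea of which paths may need a
second pass once the production load is back off the system.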