This is -very- helpful. So, if I understand you properly, I should focus on
scaling -inside- my EBS devices first. What I should really do is create a
gluster volume that starts with -lots- of 125 GB EBS devices (in my case, 32,
to achieve 2 TB of usable replicated storage). I should rsync -a to this
volume to ensure a roughly even distribution/replication of files. As the
fullest EBS devices reach 80%, replace them with 250 GB EBS devices using
snapshot/restore techniques. Next time 500 GB, next time 1 TB. Then start
over again with 512 125 GB EBS devices and another rsync -a, and repeat.
Because Gluster is a zero-metadata system, this should in theory scale to
the horizon, with a quick scriptable upgrade every doubling (roughly along
the lines of the sketch below) and one painful multi-day transition using
rsync -a every 10x.

Does this make sense? What are the gotchas with this approach?

Thanks for your insights on this!

Gart
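For concreteness, a minimal sketch of the per-device snapshot/restore upgrade
described above. It assumes today's aws CLI (the 2010-era ec2-api-tools have
equivalent commands); the volume and instance IDs, device names, availability
zone, brick path, and the ext3/ext4 filesystem are all placeholders, and this
is an untested outline rather than a recipe:

    # Upgrade one nearly-full 125 GB brick volume to 250 GB.
    # All IDs, paths, and device names below are hypothetical.
    OLD_VOL=vol-aaaaaaaa        # the 125 GB EBS volume backing the brick
    INSTANCE=i-bbbbbbbb         # the gluster fileserver it is attached to
    BRICK=/export/brick01
    DEV=/dev/sdf                # may show up as /dev/xvdf on newer kernels

    # 1. Quiesce the brick and snapshot the old volume.
    umount $BRICK
    SNAP=$(aws ec2 create-snapshot --volume-id $OLD_VOL \
             --query SnapshotId --output text)
    aws ec2 wait snapshot-completed --snapshot-ids $SNAP

    # 2. Restore the snapshot onto a volume twice the size, in the same AZ.
    NEW_VOL=$(aws ec2 create-volume --snapshot-id $SNAP --size 250 \
                --availability-zone us-east-1a \
                --query VolumeId --output text)
    aws ec2 wait volume-available --volume-ids $NEW_VOL

    # 3. Swap devices, grow the filesystem, and remount the brick.
    aws ec2 detach-volume --volume-id $OLD_VOL
    aws ec2 wait volume-available --volume-ids $OLD_VOL
    aws ec2 attach-volume --volume-id $NEW_VOL \
        --instance-id $INSTANCE --device $DEV
    until [ -b $DEV ]; do sleep 1; done   # wait for the device node to appear
    e2fsck -f $DEV && resize2fs $DEV      # offline grow of an ext3/ext4 brick
    mount $DEV $BRICK

Repeating that once per brick is the "quick scriptable upgrade every
doubling"; the periodic start-over with many small volumes still needs the
full rsync -a pass.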
On Mon, Oct 25, 2010 at 7:25 PM, Barry Jaspan <barry.jaspan at acquia.com> wrote:
> Gart,
>
> I was speaking generally in my message because I did not know anything
> about your actual situation (maybe because I did not read carefully).
> From this message, I understand your goal to be: you have a "source EBS
> volume" that you would like to replace with a gluster filesystem
> containing the same data. Based on this, my personal recommendation
> (which carries no official weight whatsoever) is:
>
> 1. On your gluster fileservers, mount whatever bricks you want. It sounds
> like you want cluster/distribute over two cluster/replicate volumes over
> two 1TB EBS volumes each, so put two 1TB bricks on each server and export
> them.
>
> 2. From the machine holding the source EBS volume, mount the gluster
> bricks created in step 1 under a volfile that arranges them under
> cluster/distribute and cluster/replicate as you wish.
>
> 3. rsync -a /source-ebs /mnt/gfs
>
> 4. Switch your production service to use /mnt/gfs.
>
> 5. rsync -a /source-ebs /mnt/gfs again to catch any stragglers. The
> actual details of when/how to run rsync, whether to take down production,
> etc. depend on your service, of course.
>
> On Mon, Oct 25, 2010 at 2:13 PM, Gart Davis <gdavis at spoonflower.com> wrote:
>>
>> My principal concerns with this relate to Barry's 3rd bullet: Gluster
>> does not rebalance evenly, and so this solution will eventually bounce
>> off the roof and lock up.
>
> We had a replicate volume. We added distribute on top of it, added a
> subvolume (which was another replicate volume), and used gluster's
> "rebalance" script, which consists of removing certain extended
> attributes, renaming files, and copying them back into place. The end
> result was that not very much data got moved to the new volume. Also,
> that approach to rebalancing has inherent race conditions. The best you
> can do to add more storage space to an existing volume is to set your
> min-free-disk low enough (perhaps 80%) so that each time a new file is
> added that should go to the old, full brick, gluster will instead create
> a link file on the old brick pointing to the new brick, and put the real
> data on the new brick. This imposes extra link-following overhead, but I
> believe it works.
>
>> Forgive my naivete Barry, when you say 'just use larger replicate
>> volumes instead of distribute', what does that mean?
>
> After our fiasco trying to switch from a single replicate volume to
> distribute over two replicates (having all the problems I just
> described), we just went back to a single replicate volume and increased
> our EBS volume sizes. They were only 100GB, and we made them 500GB. This
> worked because EBS allows it. If/when we need the bricks to be bigger
> than 1TB... well, I hope gluster has improved its capabilities by that
> point. If not, we might use lvm or whatever on the glusterfs server to
> make multiple EBS volumes look like >1TB bricks.
>
> Barry
>
>> Are you running multiple 1 TB EBS bricks in a single 'replica 2' volume
>> under a single file server? My recipe is largely riffing off Josh's
>> tutorial. You've clearly found a recipe that you're happy to entrust
>> production data to... how would you change this?
>>
>> Thanks!
>>
>> Gart
>
> --
> Barry Jaspan
> Senior Architect | Acquia
> barry.jaspan at acquia.com | (c) 617.905.2208 | (w) 978.296.5231
>
> "Get a free, hosted Drupal 7 site: http://www.drupalgardens.com"
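For reference, a minimal sketch of the "lvm or whatever" aggregation Barry
mentions, i.e. concatenating several EBS volumes on the fileserver into one
brick larger than 1TB. The device names, volume group name, brick path, and
filesystem choice below are hypothetical:

    # Two hypothetical 1 TB EBS volumes attached to the gluster fileserver.
    pvcreate /dev/xvdf /dev/xvdg
    vgcreate gluster_vg /dev/xvdf /dev/xvdg
    lvcreate -l 100%FREE -n brick01 gluster_vg     # one ~2 TB logical volume
    mkfs.ext4 /dev/gluster_vg/brick01
    mkdir -p /export/brick01
    mount /dev/gluster_vg/brick01 /export/brick01  # export this path as the brick

    # Later, grow the brick by adding another EBS volume to the group:
    pvcreate /dev/xvdh
    vgextend gluster_vg /dev/xvdh
    lvextend -l +100%FREE /dev/gluster_vg/brick01
    resize2fs /dev/gluster_vg/brick01              # ext4 supports online grow

The trade-off is that the loss of any one underlying EBS volume takes out the
whole logical volume, so the replicate pair on the other server is carrying
all of the redundancy.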