On Jan 8, 2009, at 5:23 AM, Joe Landman wrote:
> Dan Parsons wrote:
>> Now that I'm upgrading to gluster 1.4/2.0, I'm going to take the time
>> and rearchitect things.
>>
>> Hardware: Gluster servers: 4 blades connected via 4gbit FC to fast,
>> dedicated storage. Each server has two bonded Gig-E links to the rest
>> of my network, for 8gbit/s of theoretical aggregate throughput.

> Just make sure the channel-bonded gigabit links are a) not Broadcom
> based, and b) not using anything other than mode 0 (long story, but
> email me offline if you want to hear some horror stories of
> hard-to-fix crashes). If you have access to 10 GbE/IB, these would be
> superior solutions (entire system ... storage and clients).

I've actually been using this stripe + bonding solution for 8 months
with no problems (none related to those two technologies, anyway), even
though I'm using Broadcom chips (forced on me by what's on our blades).
Not only that, but the blade chassis switch is a Dell PowerConnect. I
would have much preferred a Cisco, but the replacement cost is kind of
high, and I did eventually get the PowerConnect to do the bonding
properly. I'm using mode 802.3ad, which I believe is also known as mode
4. As I said, it's worked great for 8 months. With iptraf I can watch
the 2gbit/s link saturate with very little impact on system
performance, and IRQ generation is reasonable too. I'd rather have
Intel, but I can't say the Broadcom stuff is causing any problems for
me.

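For anyone who wants to reproduce this, the Linux half of the 802.3ad
setup looks roughly like the following on a RHEL/CentOS-style system.
This is a minimal sketch rather than my actual config; the device
names, address, and file paths are placeholders:

    # /etc/modprobe.conf -- load the bonding driver in 802.3ad (mode 4)
    alias bond0 bonding
    options bond0 mode=802.3ad miimon=100 lacp_rate=fast

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- the bonded interface
    DEVICE=bond0
    IPADDR=10.0.0.10        # placeholder address
    NETMASK=255.255.255.0
    BOOTPROTO=none
    ONBOOT=yes

    # /etc/sysconfig/network-scripts/ifcfg-eth0 -- and likewise ifcfg-eth1
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes

The switch side matters just as much: both ports have to sit in the
same LACP group on the PowerConnect, which is the part that took some
fiddling to get right.
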
>> Gluster clients: 33 blades, each with one Gig-E connection. They use
>> local storage for the OS and gluster for input/output files.
>>
>> Specific questions: (1) There are many times in our workflow when
>> more than a few nodes will want the same file at the same time. This
>> made me want to use the stripe xlator: when a client node saturates
>> its Gig-E link reading a file, each gluster server is serving only
>> 250mbit/s, leaving room for more clients. If I weren't using stripe,
>> this hypothetical file would live on just one server node, and that
>> node would get slammed if more than two client nodes talked to it. Is
>> there a better way of doing this? Did I make the correct decision in
>> using the stripe xlator for this purpose? Can I achieve the same
>> thing using just afr?

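(To spell the arithmetic out: with a 4-way stripe, a client reading at
its full 1gbit/s pulls roughly a quarter of that, 250mbit/s, from each
server. Each server has a 2gbit/s bonded uplink, so something like
eight clients should be able to read at full line rate before the
servers saturate, assuming the load spreads evenly across the stripe.)
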
> Without spending money to fix the storage architecture, you really
> will need to look at afr, as stripe may help on single requests more
> than on multiple (guessing). You should be able to benchmark/test
> this, but I would imagine that AFR would help you with multiple
> simultaneous read-only accesses to specific files. Read/write will be
> more complex.
>
> If you can spend money to fix the storage architecture, go 10GbE or
> IB everywhere (storage nodes, client nodes, ...). You won't regret it.

stripe has worked flawlessly for me, helping enormously when 33 nodes
each want the same pile of multi-gigabyte files at the same time. I
asked the question I did to find out whether there's a better way of
doing this with gluster 2.0. The upgrade cost for IB was almost twice
my department's annual budget. And from what Krishna said, I believe
afr-based load balancing does not yet exist in 2.0. Again, no problems
with this setup; I'm just making sure I'm doing it the best way (with
regard to gluster).

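For context, my client-side volfile is essentially a 4-way stripe over
protocol/client volumes, along these lines. This is a trimmed sketch:
the hostnames, volume names, and block size are placeholders, and the
exact option spelling differs a little between the 1.3 and 2.0 syntax:

    # client.vol -- 4-way stripe (sketch, placeholder names/addresses)
    volume server1
      type protocol/client
      option transport-type tcp        # spelled tcp/client in older releases
      option remote-host 10.0.0.1      # placeholder
      option remote-subvolume brick
    end-volume

    # ...server2 through server4 are defined the same way...

    volume stripe0
      type cluster/stripe
      option block-size 1MB            # worth tuning for the big-file workload
      subvolumes server1 server2 server3 server4
    end-volume
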
>> (2) I would like to architect the system such that if one node goes
>> down, the others can keep serving the data, even if overall
>> throughput is less. This means that all data would need to be
>> accessible from all clients. Is this something I would use the afr
>> xlator for? If so, do I even need stripe anymore, to handle my need
>> to have multiple servers capable of sending different chunks of the
>> same file? And how does the HA xlator play into this?

> Server-side AFR. Stripe may not help the reliability here.

ZOMG - please point me at docs on how to set this up. stripe on top of
AFR sounds nice, but I do believe that would make all my client nodes
do more work, right? So having the servers handle AFR would be
beautiful.

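From what I've pieced together so far, server-side AFR would look
something like the sketch below, with each server replicating its
local export to a partner. This is untested on my end, and every name
and address in it is a placeholder, so please correct me if I have it
wrong:

    # server.vol on server1 -- untested sketch, placeholder names/addresses
    volume posix
      type storage/posix
      option directory /data/export    # this server's own data
    end-volume

    volume posix-mirror
      type storage/posix
      option directory /data/mirror    # holds the partner's replica
    end-volume

    volume partner
      type protocol/client
      option transport-type tcp
      option remote-host 10.0.0.2      # placeholder: the partner server
      option remote-subvolume posix-mirror
    end-volume

    volume afr0
      type cluster/replicate           # known as cluster/afr in 1.x
      subvolumes posix partner         # local copy + remote copy
    end-volume

    volume server
      type protocol/server
      option transport-type tcp
      subvolumes afr0 posix-mirror     # afr0 for clients, posix-mirror for the partner
      option auth.addr.afr0.allow 10.0.0.*          # auth.ip.* in older releases
      option auth.addr.posix-mirror.allow 10.0.0.*
    end-volume

Clients would then stripe across the four afr0 exports exactly as
before, so the replication traffic stays on the server-side links
instead of loading down the clients.
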
>> We have a mix of (small quantity of gigantic files) and (extremely
>> gigantic quantity of small files), so I'm sure there will need to be
>> some parameter tuning.
>>
>> Thanks in advance. If this question would be better addressed under
>> some sort of support agreement, please let me know.
>>
>> Dan Parsons

Finally, the only remaining "issue" I have with gluster is the memory
leak in the io-cache translator; I'm told it's fixed in 2.0, which is
why I'm upgrading.

Dan Parsons