Re: Cluster sizes


 



On 11/13/2013 09:19 PM, Jeff Darcy wrote:
On 11/13/2013 10:19 AM, Vijay Bellur wrote:
Makes me wonder what would be a typical deployment scenario - would we
have a single volume that spans around 10K nodes? If yes, what are the
scalability problems that we foresee? DHT's directory spread is on the
top of my mind. Would the directory spread count option be good enough
to address this?

The idea of a single volume spanning 10K nodes kind of freaks me out,
but if we support that many nodes (in this kind of scenario they're
likely to be both servers and clients) then it's almost inevitable that
we'll have users who try to create volumes across all of them.  After
all, that's the "unified namespace" value prop, right?

Yes, the "single namespace" across multiple servers is something that we will most likely run into.


I don't think directory spread count is sufficient to address this.  At
that level, we can *never* do anything that involves hitting all bricks.
That includes getxattr to fetch layouts (even if most of them are empty
for a particular directory), it includes mkdir, and so on.  We'll have
to do *everything* via consistent hashing, including the things where we
currently rely on information being global.
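
To make the consistent-hashing idea above concrete, here is a minimal sketch of a hash ring that places a name on one brick without consulting every brick. It is not how DHT is implemented; the `HashRing` class, the `vnodes` parameter, and the brick names are all hypothetical and exist only for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; md5 is used for distribution only, not security.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping names to bricks."""

    def __init__(self, bricks, vnodes=64):
        # Each brick owns `vnodes` points on the ring to even out the load.
        self._owner = {}
        for brick in bricks:
            for i in range(vnodes):
                self._owner[_hash(f"{brick}#{i}")] = brick
        self._points = sorted(self._owner)

    def lookup(self, name: str) -> str:
        # The owner is the first ring point at or after the name's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_left(self._points, _hash(name)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing([f"brick-{i}" for i in range(100)])
    print(ring.lookup("/some/dir/file.txt"))
```

The property that matters at this scale is that a lookup is a local computation, and that removing one brick only moves the names that brick owned; everything else stays where it was.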

The more I think about this, the more I rue the fact that we don't have an external metadata server ;-).


Even having that many
connections is going to be a serious problem, so we'll probably have to
do some pooling or proxying or something.  Tracking and coordinating
rebalance state is going to be another problem, so we'll probably need a
fundamentally different approach there as well.
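
A rough back-of-the-envelope sketch of why the connection count hurts, and what proxying could buy. The model is an assumption, not a measurement: one brick per node, every node also acting as a client, one connection per client/brick pair, and a hypothetical proxy tier in which each client talks to a single proxy and each proxy talks to every brick.

```python
def full_mesh_connections(nodes: int, bricks_per_node: int = 1) -> int:
    # Assumption: every node runs one client, and each client connects
    # to every brick in the volume.
    return nodes * nodes * bricks_per_node


def proxied_connections(nodes: int, proxies: int, bricks_per_node: int = 1) -> int:
    # Assumption: each client holds one connection to a proxy, and each
    # proxy holds one connection to every brick.
    return nodes + proxies * nodes * bricks_per_node


if __name__ == "__main__":
    print(full_mesh_connections(10_000))     # 100000000
    print(proxied_connections(10_000, 100))  # 1010000
```

Under these assumptions a 10K-node full mesh means 10,000 connections per client and 100 million in total, while 100 proxies would cut the total by roughly two orders of magnitude.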

Yeah, imagining that many connections is also quite painful. I think we need to think carefully about how we can reach this scale.


But first, we have to solve the glusterd scaling issues.  Any scaling in
DHT or elsewhere in the I/O plane won't even matter until the management
plane can support building a cluster that large.


+1. More data plane issues will become tangible after we have the necessary management plane infrastructure.

-Vijay



