Re: Ideal hardware spec?

On 08/24/2012 01:12 PM, Wido den Hollander wrote:


On 08/24/2012 05:05 PM, Mark Nelson wrote:

I'm running Atom D525 (SuperMicro X7SPA-HF) nodes with 4GB of RAM, 4x 2TB disks, and an 80GB SSD (an old X25-M) for journaling.

That works, but what I notice is that under heavy recovery the Atoms can't cope with it.

I'm thinking about building a couple of nodes with the AMD Brazos mainboard, something like an Asus E35M1-I.

That is not a server board, but it would just be a reference to see what it does.

One of the problems with the Atoms is the 4GB memory limitation; with the AMD Brazos you can use 8GB.

I'm trying to figure out a way to have a really large number of small nodes for a low price, so I can have a massive cluster where the impact of losing one node is very small.

Given that "massive" is a relative term, I am as well... but I'm also
trying
to reduce the footprint (power and space) of that "massive" cluster.
I also
want to start small (1/2 rack) and scale as needed.

If you do end up testing Brazos processors, please post your results! I think it really depends on what kind of performance you are aiming for. Our stock 2U test boxes have 6-core Opterons, and our SC847a has dual 6-core low-power Xeon E5s. At 10GbE+ these are probably going to be pushed pretty hard, especially during recovery.


I'm aiming for a Ceph cluster of a couple of hundred TB, consisting of 5 or 6 racks full of 1U machines, each with 4x 1TB.

That's roughly 200 nodes, none of them doing all that much work.

If one fails I'd lose 0.5% of my cluster, and recovery shouldn't be that hard. I'm assuming here that the node crashes due to hardware failure, not that it's plagued by some cluster-wide Ceph or BTRFS bug :)

Wido
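
As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. The 3x replication factor and the ~50 MB/s of recovery throughput per surviving node are my assumptions, not something stated in this thread:

# Back-of-the-envelope numbers for the layout described above.
# Assumptions (mine, not from the thread): 3x replication and roughly
# 50 MB/s of recovery throughput contributed by each surviving node.

nodes = 200
drives_per_node = 4
drive_tb = 1.0            # 4x 1TB per 1U machine
replication = 3           # assumed replica count

raw_tb = nodes * drives_per_node * drive_tb      # 800 TB raw
usable_tb = raw_tb / replication                 # ~267 TB usable
node_fraction = 1.0 / nodes                      # share of the cluster per node

# Data to re-replicate if one node dies (~4 TB), and a very rough time
# estimate at the assumed aggregate rate. This ignores per-OSD CPU limits
# and seek overhead, so treat it as optimistic.
node_data_mb = drives_per_node * drive_tb * 1e6  # TB -> MB (decimal)
recovery_rate_mb_s = (nodes - 1) * 50            # assumed aggregate rate
recovery_minutes = node_data_mb / recovery_rate_mb_s / 60

print("raw: %.0f TB, usable at %dx: %.0f TB" % (raw_tb, replication, usable_tb))
print("one node = %.1f%% of the cluster" % (node_fraction * 100))
print("rough re-replication time: %.0f minutes" % recovery_minutes)

With those assumptions you end up around 267 TB usable out of 800 TB raw, which lines up with "a couple of hundred TB", and a single node really is only 0.5% of the data.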

Just based on past experience, I figure the most common causes of failure are going to be drive "failure" and controller failure. Your solution mitigates that by just going with tons of 1U nodes with few drives each. I'm hoping we can also mitigate it by skipping expanders and doing no more than 8 drives per controller. It does mean you top out at something like 40-48 drives per node max on most server boards.
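
For the curious, the 40-48 figure just falls out of the slot count. A minimal sketch, where the 5-6 usable PCIe slots per board is my assumption about typical server boards:

# Drives-per-node ceiling when skipping SAS expanders and capping each
# controller at 8 direct-attached drives. The 5-6 usable PCIe slots per
# board is an assumption about "most server boards", not a measured fact.

drives_per_controller = 8

for pcie_slots in (5, 6):
    total = pcie_slots * drives_per_controller
    print("%d controllers x %d drives = %d drives per node" %
          (pcie_slots, drives_per_controller, total))

That gives 40 and 48 drives per node, respectively.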

Mark

