Re: Ideal hardware spec?

On 08/24/2012 01:12 PM, Wido den Hollander wrote:


On 08/24/2012 05:05 PM, Mark Nelson wrote:

I'm running Atom D525 (SuperMicro X7SPA-HF) nodes with 4GB of RAM, 4x 2TB disks, and an 80GB SSD (an old X25-M) for journaling.

That works, but what I notice is that under heavy recovery the Atoms can't cope with it.

I'm thinking about building a couple of nodes with the AMD Brazos mainboard, something like an Asus E35M1-I.

That is not a server board, but it would just be a reference to see what it does.

One of the problems with the Atoms is the 4GB memory limitation; with the AMD Brazos you can use 8GB.

I'm trying to figure out a way to have a really large number of small nodes for a low price, so I can have a massive cluster where the impact of losing one node is very small.

Given that "massive" is a relative term, I am as well... but I'm also
trying
to reduce the footprint (power and space) of that "massive" cluster.
I also
want to start small (1/2 rack) and scale as needed.

If you do end up testing Brazos processors, please post your results! I think it really depends on what kind of performance you are aiming for. Our stock 2U test boxes have 6-core Opterons, and our SC847a has dual 6-core low-power Xeon E5s. At 10GbE+ these are probably going to be pushed pretty hard, especially during recovery.


I'm aiming for a Ceph cluster of a couple of hundred TB, consisting of 5 or 6 racks full of 1U machines, each with 4x 1TB.

That's roughly 200 nodes, none of them doing all that much work.

If one fails I'd lose 0.5% of my cluster, and recovery shouldn't be that hard. I'm assuming here that the node crashes due to hardware failure, not that it's plagued by some cluster-wide Ceph or BTRFS bug :)

Wido
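
As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. The 3x replication factor and the ~50 MB/s of recovery throughput per surviving node are my assumptions, not something stated in this thread:

# Back-of-the-envelope numbers for the layout described above.
# Assumptions (mine, not from the thread): 3x replication and roughly
# 50 MB/s of recovery throughput contributed by each surviving node.

nodes = 200
drives_per_node = 4
drive_tb = 1.0            # 4x 1TB per 1U machine
replication = 3           # assumed replica count

raw_tb = nodes * drives_per_node * drive_tb      # 800 TB raw
usable_tb = raw_tb / replication                 # ~267 TB usable
node_fraction = 1.0 / nodes                      # share of the cluster per node

# Data to re-replicate if one node dies (~4 TB), and a very rough time
# estimate at the assumed aggregate rate. This ignores per-OSD CPU limits
# and seek overhead, so treat it as optimistic.
node_data_mb = drives_per_node * drive_tb * 1e6  # TB -> MB (decimal)
recovery_rate_mb_s = (nodes - 1) * 50            # assumed aggregate rate
recovery_minutes = node_data_mb / recovery_rate_mb_s / 60

print("raw: %.0f TB, usable at %dx: %.0f TB" % (raw_tb, replication, usable_tb))
print("one node = %.1f%% of the cluster" % (node_fraction * 100))
print("rough re-replication time: %.0f minutes" % recovery_minutes)

With those assumptions you end up around 267 TB usable out of 800 TB raw, which lines up with "a couple of hundred TB", and a single node really is only 0.5% of the data.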

Just based on past experience, I figure the most common causes of failure are going to be drive "failure" and controller failure. Your solution mitigates that by just going with tons of 1U nodes with few drives each. I'm hoping we can also mitigate it by skipping expanders and doing no more than 8 drives per controller. It does mean you top out at something like 40-48 drives per node max on most server boards.
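
For the curious, the 40-48 figure just falls out of the slot count. A minimal sketch, where the 5-6 usable PCIe slots per board is my assumption about typical server boards:

# Drives-per-node ceiling when skipping SAS expanders and capping each
# controller at 8 direct-attached drives. The 5-6 usable PCIe slots per
# board is an assumption about "most server boards", not a measured fact.

drives_per_controller = 8

for pcie_slots in (5, 6):
    total = pcie_slots * drives_per_controller
    print("%d controllers x %d drives = %d drives per node" %
          (pcie_slots, drives_per_controller, total))

That gives 40 and 48 drives per node, respectively.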

Mark

