Re: Disk/Pool Layout

I already have the S3510s, so I would stick with those :S... the S3700s were a bit high on price and the manager didn't approve it, not my decision, unfortunately.

Also, it would be very hard to add another node at the moment, so maybe I can start with 6 OSD nodes instead of 8, leave those 2 extra nodes aside, and in a couple of months buy one more node and add the three nodes at the same time.

I have the nodes installed in two separate racks, on two different circuits and PDUs. I would need some info on CRUSH tuning so I can start building the solution. Anyone?
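
For reference, a minimal sketch of rack-aware placement with the stock CRUSH bucket types; the rack and host names below are only placeholders:

   # describe the physical layout: two racks under the default root
   ceph osd crush add-bucket rack1 rack
   ceph osd crush add-bucket rack2 rack
   ceph osd crush move rack1 root=default
   ceph osd crush move rack2 root=default

   # put each OSD host under the rack it physically lives in
   ceph osd crush move cephosd01 rack=rack1
   ceph osd crush move cephosd02 rack=rack2
   # ...repeat for the remaining hosts...

   # a simple rule that spreads replicas across racks instead of hosts
   ceph osd crush rule create-simple rack-spread default rack

Keep in mind that with only two racks a rack-level failure domain can only fully place pools of size 2; for size 3 you would need a third rack or a custom rule that picks two racks and then splits the replicas across hosts inside them.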

Thanks a lot!

Best regards,


German

2015-08-27 13:40 GMT-03:00 Jan Schermer <jan@xxxxxxxxxxx>:
In that case you need fair IOPS and high throughput. Go with the S3610 or the Samsungs (or something else that people here can recommend, but for the love of god don't save on drives :)). It's easier to stick to one type of drive and not complicate things.

I would also recommend adding one storage node and building the CRUSH map with an intermediate bucket level for groups of 3 nodes. It makes maintenance much easier and the data more durable in case of failure. (Best to put the groups in separate cages on different UPSes; then you can do things like disabling barriers if you go with some cheaper drives that need it.) I'm not a CRUSH expert, and there are more tricks to apply before you set this up.
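
A minimal sketch of that intermediate level, reusing the stock "chassis" bucket type for the groups of three (all group and host names below are made up):

   # one intermediate bucket per group of 3 OSD hosts
   ceph osd crush add-bucket group1 chassis
   ceph osd crush add-bucket group2 chassis
   ceph osd crush add-bucket group3 chassis
   ceph osd crush move group1 root=default
   ceph osd crush move group2 root=default
   ceph osd crush move group3 root=default

   # hosts go under their group
   ceph osd crush move cephosd01 chassis=group1
   ceph osd crush move cephosd02 chassis=group1
   ceph osd crush move cephosd03 chassis=group1
   # ...the remaining hosts go into group2 and group3...

   # replicate across groups, so losing a whole group (cage/UPS) costs only one replica
   ceph osd crush rule create-simple group-spread default chassis

With 9 hosts in 3 groups and pool size 3, each group then holds exactly one copy of every object, which is what makes taking a whole group down for maintenance reasonably safe.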

Jan

On 27 Aug 2015, at 18:31, German Anders <ganders@xxxxxxxxxxxx> wrote:

Hi Jan,

   Thanks for responding to the email. Regarding the cluster usage, we are going to use it for non-relational databases (Cassandra, MongoDB and other apps), so we need the cluster to respond well to I/O-intensive applications. It will also be connected to HP enclosures over IB FDR, and mapped through Cinder to mount on VMs (KVM hypervisor); on the VMs we will then run the non-relational databases.

Thanks in advance,


German

2015-08-27 13:25 GMT-03:00 Jan Schermer <jan@xxxxxxxxxxx>:
Some comments inline.
A lot of it depends on your workload, but I'd say you almost certainly need higher-grade SSDs. You can save money on memory.

What will be the role of this cluster? VM disks? Object storage? Streaming?...

Jan

On 27 Aug 2015, at 17:56, German Anders <ganders@xxxxxxxxxxxx> wrote:

Hi all,

   I'm planning to deploy a new Ceph cluster over IB FDR 56 Gb/s, and I have the following hardware:

3x MON Servers:
   2x Intel Xeon E5-2600 v3 8C
   256GB RAM

I don't think you need that much memory, 64GB should be plenty (if that's the only role for the servers).

   1xIB FDR ADPT-DP (two ports for PUB network)
   1xGB ADPT-DP
  
   Disk Layout:
  
   SOFT-RAID:
   SCSI1 (0,0,0) (sda) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)
   SCSI2 (0,0,0) (sdb) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)

I 100% recommend going with fast SSDs for the /var/lib/ceph/mon storage (they can be fairly small). They should be the same grade as the journal drives IMO.
NOT S3500!
I can recommend the S3610 (just got some :)) or the Samsung 845DC PRO. At least a 1 DWPD rating; better to go with 3 DWPD.


8x OSD Servers:
   2x Intel Xeon E5-2600 v3 10C

Go for the fastest clocks you can afford if you need low latency, even at the expense of cores.
Go for more cores if you want higher throughput.

   256GB RAM

Again, I think that's too much if it's the only role for those nodes; 64GB should be plenty.


   1xIB FDR ADPT-DP (one port for PUB and one for CLUS network)
   1xGB ADPT-DP

   Disk Layout:

   SOFT-RAID:
   SCSI1 (0,0,0) (sda) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)
   SCSI2 (0,0,0) (sdb) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)

   JBOD:
   SCSI9 (0,0,0) (sdd) - 120.0 GB ATA INTEL SC3500 SSDSC2BB12 (Journal)
   SCSI9 (0,1,0) (sde) - 120.0 GB ATA INTEL SC3500 SSDSC2BB12 (Journal)
   SCSI9 (0,2,0) (sdf) - 120.0 GB ATA INTEL SC3500 SSDSC2BB12 (Journal)

No, no, no. Those SSDs will die a horrible death; too little endurance.
Better to go with 2x S3700 in RAID1 and partition them for journals. Or just don't use journaling drives at all and buy better SSDs for storage.
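
For scale, a 120 GB S3500 is rated for roughly 0.3 drive writes per day, and every client write to the OSDs behind it hits the journal first. If you do go the mirrored-journal route, a rough sketch follows; the device names come from the listing above, but the md device name, partition sizes and ceph-disk invocations are only an example, not a tested recipe:

   # mirror two journal SSDs (sdd/sde from the listing)
   mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd /dev/sde

   # one journal partition per data disk, e.g. ~20 GB each
   parted -s /dev/md2 mklabel gpt
   parted -s /dev/md2 mkpart journal-sdg 1MiB 20GiB
   parted -s /dev/md2 mkpart journal-sdh 20GiB 40GiB
   # ...and so on, one partition per OSD...

   # prepare each OSD with its data disk and the matching journal partition
   ceph-disk prepare /dev/sdg /dev/md2p1
   ceph-disk prepare /dev/sdh /dev/md2p2

Compared to using the two SSDs as independent journal devices you give up half the aggregate journal bandwidth, but a single journal SSD failure no longer takes down every OSD journaling on it.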

   SCSI9 (0,3,0) (sdg) - 800.2 GB ATA INTEL SC3510 SSDSC2BB80 (Pool-SSD)
   SCSI9 (0,4,0) (sdh) - 800.2 GB ATA INTEL SC3510 SSDSC2BB80 (Pool-SSD)
   SCSI9 (0,5,0) (sdi) - 800.2 GB ATA INTEL SC3510 SSDSC2BB80 (Pool-SSD)
   SCSI9 (0,6,0) (sdj) - 800.2 GB ATA INTEL SC3510 SSDSC2BB80 (Pool-SSD)

Too little endurance.


   SCSI9 (0,7,0) (sdk) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
   SCSI9 (0,8,0) (sdl) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
   SCSI9 (0,9,0) (sdm) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
   SCSI9 (0,10,0) (sdn) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
   SCSI9 (0,11,0) (sdo) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)


I would like an expert opinion on what would be the best way to deploy/configure the disk pools and the CRUSH map. Any other advice?
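
One common way to split the SSD and SATA disks into separate pools is to give each media type its own CRUSH root, with one rule and one pool per root. A minimal sketch; the host/pool names, weights and PG counts below are only placeholders and should be sized for the real OSD count:

   # two roots, one per media type
   ceph osd crush add-bucket ssd root
   ceph osd crush add-bucket sata root

   # per-host buckets under each root, then place the OSDs (weights roughly in TiB)
   ceph osd crush add-bucket cephosd01-ssd host
   ceph osd crush move cephosd01-ssd root=ssd
   ceph osd crush set osd.3 0.72 host=cephosd01-ssd     # one of the 800 GB SSDs
   ceph osd crush add-bucket cephosd01-sata host
   ceph osd crush move cephosd01-sata root=sata
   ceph osd crush set osd.7 2.73 host=cephosd01-sata    # one of the 3 TB spinners
   # ...repeat for every host and OSD...

   # one rule and one pool per root
   ceph osd crush rule create-simple ssd-rule ssd host
   ceph osd crush rule create-simple sata-rule sata host
   ceph osd pool create rbd-ssd 1024 1024
   ceph osd pool create rbd-sata 2048 2048
   ceph osd pool set rbd-ssd crush_ruleset 1     # use the ids shown by 'ceph osd crush rule dump'
   ceph osd pool set rbd-sata crush_ruleset 2

Cinder can then expose the two pools as separate volume types, so the databases land on the SSD pool and colder data on the SATA pool.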

Thanks in advance,

Best regards,

German