Re: Disk/Pool Layout

On Thu, Aug 27, 2015 at 10:25 AM, Jan Schermer wrote:
> Some comments inline.
> A lot of it depends on your workload, but I'd say you almost certainly need
> higher-grade SSDs. You can save money on memory.
>
> What will be the role of this cluster? VM disks? Object storage?
> Streaming?...
>
> Jan
>
> On 27 Aug 2015, at 17:56, German Anders wrote:
>
> Hi all,
>
> I'm planning to deploy a new Ceph cluster with IB FDR 56Gb/s, and I have the
> following HW:
>
> 3x MON Servers:
>    2x Intel Xeon E5-2600 v3 8C

This is overkill if the box is only acting as a monitor server.

>
>    256GB RAM
>
>
> I don't think you need that much memory, 64GB should be plenty (if that's
> the only role for the servers).


If it's only running a monitor, you can get by with even less.

>
>    1x IB FDR ADPT-DP (two ports for PUB network)
>    1xGB ADPT-DP
>
>    Disk Layout:
>
>    SOFT-RAID:
>    SCSI1 (0,0,0) (sda) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)
>    SCSI2 (0,0,0) (sdb) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)
>
>
> I 100% recommend going with SSDs for the /var/lib/ceph/mon storage, fast
> ones (but they can be fairly small). Should be the same grade as journal
> drives IMO.
> NOT S3500!
> I can recommend the S3610 (just got some :)) or the Samsung 845 DC PRO. At
> least a 1 DWPD rating; better to go with 3 DWPD.

The S3500 should be just fine here. I get 25% better performance from the
S3500 than from the S3700 doing sync direct writes, and write endurance
should be fine since the volume of monitor data is not going to be that
great. Unless there is something else I'm not aware of.
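
For anyone who wants to reproduce that comparison, this is roughly the test
I mean: small synchronous direct writes at queue depth 1, which is the
pattern the mon store and OSD journals generate. A sketch only; /dev/sdX is
a placeholder, and writing to the raw device destroys whatever is on it:

    # 4k sequential writes with O_DIRECT and O_SYNC at QD=1.
    # WARNING: destructive to any data on the target device.
    fio --name=sync-write-test --filename=/dev/sdX \
        --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting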

>
>
> 8x OSD Servers:
>    2x Intel Xeon E5-2600 v3 10C
>
>
> Go for the fastest you can afford if you need the latency - even at the
> expense of cores.
> Go for cores if you want bigger throughput.

I'm in the middle of my testing, but it seems that with lots of I/O
depth (whether from a single client or from multiple clients), clock
speed does not have as much of an impact as core count does. Once I'm
done, I'll post my results. Unless you have a single client running at
QD=1, go for cores at this point.
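
The sweeps I'm running look roughly like this (a sketch assuming fio's rbd
engine; the pool name, image name, and client name are all illustrative):

    # Random 4k writes against a test RBD image at increasing queue
    # depths; compare IOPS/latency across the sweep.
    for qd in 1 4 16 64 128; do
      fio --name=qd-$qd --ioengine=rbd --clientname=admin \
          --pool=rbd --rbdname=test --rw=randwrite --bs=4k \
          --iodepth=$qd --runtime=60 --time_based \
          --group_reporting
    done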

>
>    256GB RAM
>
>
> Again - I think too much if that's the only role for those nodes, 64GB
> should be plenty.

Agreed. That said, if you can afford more RAM, it just means more page cache.

>
>
>    1x IB FDR ADPT-DP (one port for PUB and one for CLUS network)
>    1xGB ADPT-DP
>
>    Disk Layout:
>
>    SOFT-RAID:
>    SCSI1 (0,0,0) (sda) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)
>    SCSI2 (0,0,0) (sdb) - 120.0 GB ATA INTEL SSDSC2BB12 (OS-RAID1)
>
>    JBOD:
>    SCSI9 (0,0,0) (sdd) - 120.0 GB ATA INTEL S3500 SSDSC2BB12 (Journal)
>    SCSI9 (0,1,0) (sde) - 120.0 GB ATA INTEL S3500 SSDSC2BB12 (Journal)
>    SCSI9 (0,2,0) (sdf) - 120.0 GB ATA INTEL S3500 SSDSC2BB12 (Journal)
>
>
> No no no. Those SSDs will die a horrible death, too little endurance.
> Better to go with 2x S3700 in RAID1 and partition them for journals. Or just
> don't use journaling drives and buy better SSDs for storage.

If he is only using these for journals, they can be just fine. You can get
the same endurance as the S3700 by using only a portion of the drive space
and leaving the rest unallocated as extra over-provisioning. [1][2] A sketch
of what that looks like is below.
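
Assuming the journals go on the 120 GB drives from the layout above,
something like this carves out three small journal partitions and leaves the
remainder unallocated (device name and sizes are illustrative; the typecode
is the GUID ceph-disk uses for journal partitions):

    # Three 10 GB journal partitions; the other ~90 GB stays
    # unallocated so the controller can use it for over-provisioning.
    # Secure-erase or blkdiscard the drive first so those LBAs are
    # actually free from the controller's point of view.
    sgdisk --new=1:0:+10G --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdd
    sgdisk --new=2:0:+10G --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdd
    sgdisk --new=3:0:+10G --typecode=3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdd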

>
>    SCSI9 (0,3,0) (sdg) - 800.2 GB ATA INTEL S3510 SSDSC2BB80 (Pool-SSD)
>    SCSI9 (0,4,0) (sdh) - 800.2 GB ATA INTEL S3510 SSDSC2BB80 (Pool-SSD)
>    SCSI9 (0,5,0) (sdi) - 800.2 GB ATA INTEL S3510 SSDSC2BB80 (Pool-SSD)
>    SCSI9 (0,6,0) (sdj) - 800.2 GB ATA INTEL S3510 SSDSC2BB80 (Pool-SSD)
>
>
> Too little endurance.
>

Same as above.


[1] http://www.sandisk.com/assets/docs/WP004_OverProvisioning_WhyHow_FINAL.pdf
[2] http://storage.toshiba.com/docs/services-support-documents/ssd_application_note.pdf




>
>    SCSI9 (0,7,0) (sdk) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
>    SCSI9 (0,8,0) (sdl) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
>    SCSI9 (0,9,0) (sdm) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
>    SCSI9 (0,10,0) (sdn) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
>    SCSI9 (0,11,0) (sdo) - 3.0 TB SEAGATE ST3000NM0023 (Pool-SATA)
>
>
> I would like an expert opinion on the best way to lay out the disk
> pools and CRUSH map for this hardware. Any other advice?
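
With SSDs and spinners mixed in each host, the usual approach is to split
them into separate CRUSH roots and point a pool at each, so the SSD pool
never lands on the SATA drives. A rough sketch of the commands (bucket,
rule, and pool names, the weights, ruleset IDs, and PG counts are all
illustrative; repeat the host buckets for all eight nodes):

    # Separate roots for the two media types.
    ceph osd crush add-bucket ssd root
    ceph osd crush add-bucket sata root

    # Per-host buckets under each root, then place each OSD in the
    # bucket matching its media type (example for one host).
    ceph osd crush add-bucket node1-ssd host
    ceph osd crush move node1-ssd root=ssd
    ceph osd crush add-bucket node1-sata host
    ceph osd crush move node1-sata root=sata
    ceph osd crush set osd.0 0.8 host=node1-ssd
    ceph osd crush set osd.4 2.73 host=node1-sata

    # One replication rule per root, spreading replicas across hosts.
    ceph osd crush rule create-simple ssd-rule ssd host
    ceph osd crush rule create-simple sata-rule sata host

    # Pools tied to the matching rule (get the ruleset IDs from
    # 'ceph osd crush rule dump').
    ceph osd pool create ssd-pool 2048 2048
    ceph osd pool set ssd-pool crush_ruleset 1
    ceph osd pool create sata-pool 2048 2048
    ceph osd pool set sata-pool crush_ruleset 2

You'd also want 'osd crush update on start = false' in ceph.conf (or a
custom crush location hook) so the OSDs don't move themselves back under
the default root when they restart.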
>
> Thanks in advance,
>
> Best regards,
>
> German

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


