Re: recommendation for barebones server with 8-12 direct attach NVMe?

>> So we were going to replace a Ceph cluster with some hardware we had
>> laying around using SATA HBAs but I was told that the only right way
>> to build Ceph in 2023 is with direct attach NVMe.

My impressions are somewhat different:

* Nowadays it is rather more difficult to find 2.5in SAS or SATA
  "Enterprise" SSDs than most NVMe types. NVMe as a host bus
  also has much greater bandwidth than SAS or SATA, but Ceph is
  mostly about IOPS rather than single-device bandwidth. So in
  general, willingly or not, one has to move to NVMe.

* Ceph was designed (and most people have forgotten it) for lots
  of cheap, small-capacity, 1-OSD servers, but unfortunately it
  is not easy to find small cheap "enterprise" SSD servers. In
  part because many people rather unwisely use capacity per
  server-price as the figure of merit, most NVMe servers have
  many slots, which means either RAID-ing devices into a small
  number of large OSDs, which goes against all Ceph stands for,
  or running many OSD daemons on one system, which works-ish but
  is not ideal (see the sketch below).
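
To make that last point concrete, here is a minimal sketch of the
"one OSD per device, no RAID in between" layout, assuming the usual
ceph-volume tooling; the /dev/nvmeXn1 names are hypothetical and
the loop is just shorthand for repeating the same command per
drive:

    # Sketch: create one BlueStore OSD per whole NVMe namespace,
    # rather than RAID-ing the devices into one big OSD.
    import glob
    import subprocess

    # Hypothetical device names; check with lsblk on the real box.
    devices = sorted(glob.glob("/dev/nvme[0-9]n1"))

    for dev in devices:
        # ceph-volume prepares and activates a separate OSD per device.
        subprocess.run(["ceph-volume", "lvm", "create", "--data", dev],
                       check=True)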

>> Does anyone have any recommendation for a 1U barebones server
>> (we just drop in ram disks and cpus) with 8-10 2.5" NVMe bays
>> that are direct attached to the motherboard without a bridge
>> or HBA for Ceph specifically?

> If you're buying new, Supermicro would be my first choice for
> vendor based on experience.
> https://www.supermicro.com/en/products/nvme

Indeed, SuperMicro does them fairly well, and there are also
GigaByte and, I think, Tyan models; I have not yet seen
Intel-based ones.

> You said 2.5" bays, which makes me think you have existing
> drives. There are models to fit that, but if you're also
> considering new drives, you can get further density in E1/E3

BTW "NVMe" is a bus specification (something not too different
from SCSI-over-PCIe), and there are several different physical
specifications, like 2.5in U.2 (SFF-8639), 2.5in U.3
(SFF-TA-1001), and various types of EDSFF (SFF-TA-1006,7,8). U.3
is still difficult to find but its connector supports SATA, SAS
and NVMe U.2; I have not yet seen EDSFF boxes actually available
retail without enormous delivery times, I guess the big internet
companies buy all the available production.

https://nvmexpress.org/wp-content/uploads/Session-4-NVMe-Form-Factors-Developer-Day-SSD-Form-Factors-v8.pdf
https://media.kingston.com/kingston/content/ktc-content-nvme-general-ssd-form-factors-graph-en-3.jpg
https://media.kingston.com/kingston/pdf/ktc-article-understanding-ssd-technology-en.pdf
https://www.snia.org/sites/default/files/SSSI/OCP%20EDSFF%20JM%20Hands.pdf
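
On the practical side, whatever the form factor, it is worth
checking what PCIe link each controller actually negotiated, since
"direct attach" only helps if nothing in between eats lanes. A
rough sketch, assuming the usual Linux sysfs layout on the storage
node itself:

    # Sketch: print the negotiated PCIe link of each NVMe controller,
    # a quick sanity check that drives are direct-attached at full width.
    import glob
    import os

    for ctrl in sorted(glob.glob("/sys/class/nvme/nvme*")):
        pci = os.path.join(ctrl, "device")
        try:
            with open(os.path.join(pci, "current_link_speed")) as f:
                speed = f.read().strip()
            with open(os.path.join(pci, "current_link_width")) as f:
                width = f.read().strip()
        except FileNotFoundError:
            continue  # e.g. NVMe-oF or other non-PCIe controllers
        print(f"{os.path.basename(ctrl)}: {speed}, x{width}")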

> The only caveat is that you will absolutely want to put a
> better NIC in these systems, because 2x10G is easy to saturate
> with a pile of NVME.

That's one reason why Ceph was designed for many small 1-OSD
servers (ideally distributed across several racks) :-). Note: to
maximize the chances of many-to-many traffic instead of
many-to-one. Anyhow, Ceph again is all about lots of IOPS more
than bandwidth, but if you need bandwidth, nowadays many 10Gb
NICs support 25Gb/s too, and 40Gb/s and 100Gb/s are no longer
that expensive (but the cables are horrible).
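
For a back-of-the-envelope feel of the saturation point (the
throughput figure below is a rough assumption, not a
measurement):

    # Sketch: roughly how many NVMe drives it takes to fill a 2x10G NIC.
    nic_gbps = 2 * 10        # bonded 2x10GbE, line rate
    drive_gbps = 3.0 * 8     # assumed ~3 GB/s per PCIe 3.0 x4 drive, in Gb/s
    print(f"Drives to saturate the NIC: {nic_gbps / drive_gbps:.2f}")
    # Prints roughly 0.83: a single drive can already fill both links,
    # which is why 25G/100G (or fewer drives per server) makes sense.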