We're running 12 OSDs per node, with 32 hyper-threaded CPUs available. We over-provisioned the CPUs because we would also like to run jobs from our batch system and isolate them via cgroups (we're a high-throughput computing facility). With a total of ~13,000 PGs across a few pools, I'm seeing about 1GB of resident memory per OSD. As far as EC plugins go, we're using jerasure and haven't experimented with others.
That said, in our use case we're using CephFS, so we're fronting the erasure-coded pool with a cache tier. The cache pool is limited to 5TB, and right now usage is light enough that most operations live in the cache tier and rarely get flushed out to the EC pool. I'm sure as we bring more users onto this, there will be some more tweaking to do.
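In case it's useful, the cache-tier wiring looks roughly like this (pool names are placeholders, not our actual ones; a sketch rather than our exact commands):

```shell
# Attach a replicated cache pool in front of the EC pool
# ("ecpool" / "cachepool" are hypothetical names).
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool

# Cap the cache at 5 TB so flush/evict kicks in before it fills
ceph osd pool set cachepool target_max_bytes 5497558138880
```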
As far as performance goes, you might want to read Mark Nelson's excellent document about EC performance under Firefly. If you search the list archives, he sent a mail in February titled "Erasure Coding CPU Overhead Data". I can forward you the PDF off-list if you would like.
--Lincoln
On Jun 19, 2015, at 12:42 PM, Sean wrote:
Thanks, Lincoln! May I ask how many drives you have per storage node and how many threads you have available? I.e., are you using hyper-threading, and do you have more than 24 disks per node in your cluster? I noticed with our replicated cluster that more disks == more PGs == more CPU/RAM, and with 24+ disks this ends up causing issues in some cases. So a 3-node cluster with 70 disks each is fine, but scaling up to 21 nodes I see issues, even with connections, PIDs, and file descriptors turned up. Are you using just jerasure, or have you tried the ISA driver as well?
Sorry for bombarding you with questions; I'm just curious as to where the 40% performance figure comes from.
On 06/19/2015 11:05 AM, Lincoln Bryant wrote:
Hi Sean,
We have ~1PB of EC storage using Dell R730xd servers with 6TB OSDs. We've got our erasure-coding profile set to k=10,m=3, which gives us a very reasonable fraction of the raw storage with nice resiliency.
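For reference, a profile like ours can be created along these lines (the profile and pool names and the PG count are illustrative, not our exact setup):

```shell
# k=10 data chunks + m=3 coding chunks: the pool survives the loss
# of any 3 chunks, and usable capacity is 10/13 ~= 77% of raw.
ceph osd erasure-code-profile set k10m3 \
    k=10 m=3 \
    plugin=jerasure technique=reed_sol_van \
    ruleset-failure-domain=host

# Create an EC pool using that profile (PG count is illustrative)
ceph osd pool create ecpool 4096 4096 erasure k10m3
```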
I found that CPU usage was significantly higher with EC, but not so much as to be problematic. Additionally, EC performance was about 40% of replicated-pool performance in our testing. With 36-disk servers you'll probably need to make sure you do the usual kernel tweaks, like increasing the maximum number of file descriptors.
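Concretely, the tweaks I mean are along these lines (example values only, tune for your nodes; run as root):

```shell
# Raise system-wide limits for many-OSD nodes (example values).
cat > /etc/sysctl.d/90-ceph.conf <<'EOF'
fs.file-max = 524288
kernel.pid_max = 4194304
EOF
sysctl --system

# Raise the per-process open-file limit for the OSD daemons.
cat > /etc/security/limits.d/90-ceph.conf <<'EOF'
*  soft  nofile  131072
*  hard  nofile  131072
EOF
```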
Cheers,
Lincoln
On Jun 19, 2015, at 10:36 AM, Sean wrote:
I am looking to run Ceph with EC on a few leftover storage servers (36-disk Supermicro servers with dual Xeon sockets and around 256GB of RAM). I did a small test using one node and the ISA library, and noticed that the CPU load was pretty spiky even during normal operation.
Does anyone have any experience running Ceph EC on around 216 to 270 4TB disks? I'm looking to yield around 680TB to 1PB if possible. Just putting my feelers out there to see if anyone else has had any experience, and looking for any guidance.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com