Re: Possible improvements for a slow write speed (excluding independent SSD journals)

Hello,

On Thu, 23 Apr 2015 18:40:38 -0400 Anthony Levesque wrote:

> To update you on the current test in our lab:
> 
> 1. We tested the Samsung OSDs in recovery mode and the speed was able to
> max out 2x 10GbE ports (transferring data at 2200+ MB/s during recovery).
> So for normal write operations without O_DSYNC writes, the Samsung drives
> seem OK.
> 
> 2. We then tested a couple of different SSD models we had in stock with
> the following command:
> 
> dd if=randfile of=/dev/sda bs=4k count=100000 oflag=direct,dsync
> 
> This was from a blog post written by Sebastien Han and I think it should
> show how the drives perform with O_DSYNC writes. For people interested,
> here are the results of what we tested:
> 
> Intel DC S3500 120GB                  = 114 MB/s
> Samsung Pro 128GB                     = 2.4 MB/s
> WD Black 1TB (HDD)                    = 409 KB/s
> Intel 330 120GB                       = 105 MB/s
> Intel 520 120GB                       = 9.4 MB/s
> Intel 335 80GB                        = 9.4 MB/s
> Samsung EVO 1TB                       = 2.5 MB/s
> Intel 320 120GB                       = 78 MB/s
> OCZ Revo Drive 240GB                  = 60.8 MB/s
> 4x Samsung EVO 1TB LSI RAID0 HW + BBU = 28.4 MB/s
>
No real surprises here, but a nice summary nonetheless. 
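
Since dd was run with bs=4k, those throughput figures translate directly
into synchronous 4 KB writes per second, which is closer to what a journal
actually sees. A rough sketch of the conversion, assuming dd reports decimal
units (1 MB = 10^6 bytes); the function name is made up for illustration:

```shell
# Convert a dd throughput figure (MB/s, decimal) into 4 KiB sync writes/s.
# sync_iops is a hypothetical helper, not part of any tool.
sync_iops() {
    awk -v mbps="$1" 'BEGIN { printf "%.0f\n", mbps * 1000000 / 4096 }'
}

sync_iops 114     # Intel DC S3500 120GB  -> 27832
sync_iops 2.4     # Samsung Pro 128GB     -> 586
sync_iops 0.409   # WD Black 1TB (HDD)    -> 100
```

Seen that way, the Samsung Pro manages only about six times as many sync
writes per second as the spinning disk.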

You _really_ want to avoid consumer SSDs for journals and have a good idea
of how much data you'll write per day and how long you expect your SSDs to
last (the TBW/$ ratio).
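
As a back-of-the-envelope sketch of that lifetime estimate: divide the rated
endurance by your daily write volume. The numbers below are placeholders,
not ratings of any particular drive, and the helper name is made up. Keep in
mind that a journal SSD sees the writes of every OSD it serves.

```shell
# Rough SSD lifetime estimate from rated endurance and daily write volume.
# Both inputs are placeholders; take TBW from the vendor data sheet and the
# daily figure from your own monitoring.
ssd_lifetime_years() {
    # $1 = rated endurance in TB written (TBW), $2 = GB written per day
    awk -v tbw="$1" -v daily="$2" \
        'BEGIN { printf "%.1f\n", tbw * 1000 / daily / 365 }'
}

ssd_lifetime_years 100 50   # hypothetical 100 TBW drive at 50 GB/day -> 5.5
```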

> Please let us know if the command we ran was not optimal for testing
> O_DSYNC writes.
> 
> We ordered larger drives from the Intel DC series to see if we could get
> more than 200 MB/s per SSD. We will keep you posted on the tests if that
> interests you guys. We didn't run multiple parallel tests yet (to
> simulate multiple journals on one SSD).
> 
You can totally trust the numbers on Intel's site:
http://ark.intel.com/products/family/83425/Data-Center-SSDs

The S3500s are by far the slowest and have the lowest endurance.
Again, depending on your expected write level, the S3610 or S3700 models
are going to be a better fit regarding price/performance,
especially when you consider that losing a journal SSD will result in
several dead OSDs.
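
As for the parallel test you haven't run yet: fio can do that in one go by
raising numjobs. The sketch below only prints the command lines instead of
running them, since writing to a raw device destroys its contents.
/dev/sdX and the helper name are placeholders. Note that fio's sync=1 opens
the device with O_SYNC, which is close to, but not identical to, dd's dsync.

```shell
# Build (but do not run) a fio command line that issues 4k synchronous
# direct writes from N parallel jobs, roughly simulating N journals on one
# SSD. journal_test_cmd and /dev/sdX are placeholders for illustration.
journal_test_cmd() {
    # $1 = target device, $2 = number of parallel jobs
    echo "fio --name=journal-test --filename=$1 --direct=1 --sync=1" \
         "--rw=write --bs=4k --iodepth=1 --numjobs=$2" \
         "--runtime=60 --time_based --group_reporting"
}

for jobs in 1 2 4; do
    journal_test_cmd /dev/sdX "$jobs"
done
```

Run the printed commands by hand against a drive you can afford to wipe and
compare the aggregate bandwidth as the job count goes up.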

> 3. We removed the journal from all Samsung OSDs and put 2x Intel 330 120GB
> in all 6 nodes to test. The overall speed we were getting from rados
> bench went from approx. 1000 MB/s to 450 MB/s, which might only be
> because the Intels cannot do much in terms of journaling (they tested
> at around 100 MB/s). It will be interesting to test with bigger Intel
> DC S3500 drives (and more journals) per node to see if I can get back up
> to 1000 MB/s or even surpass it.
> 
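
A quick sanity check on that 450 MB/s: every replica write goes through a
journal first, so the journals alone put a ceiling on client throughput. A
minimal sketch, using your 105 MB/s dd figure for the Intel 330; the replica
count is an assumption, since the pool size wasn't stated, and the helper
name is made up:

```shell
# Upper bound on client write throughput imposed by the journals alone.
# journal_ceiling is a hypothetical helper; the replica count is an
# assumption, since the pool size was not stated in this thread.
journal_ceiling() {
    # $1 = nodes, $2 = journal SSDs per node, $3 = MB/s per journal SSD,
    # $4 = replica count (every replica write goes through a journal)
    awk -v n="$1" -v j="$2" -v mbps="$3" -v r="$4" \
        'BEGIN { printf "%.0f\n", n * j * mbps / r }'
}

journal_ceiling 6 2 105 2   # prints 630
journal_ceiling 6 2 105 3   # prints 420
```

Your result sits in that range, so the journal SSDs could plausibly account
for the drop on their own.
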
> We also wanted to test whether the CPU could be a huge bottleneck, so we
> swapped the dual E5-2620v2 in node #6 for dual E5-2609v2 (which are
> much smaller in core count and speed), and the 450 MB/s we got from the
> rados bench dropped even lower, to 180 MB/s.
> 
You really don't have to swap CPUs around; monitor things with atop or
other tools to see where your bottlenecks are.

> So I'm wondering if the 1000 MB/s we got when the journal was shared on
> the OSD SSD was limited by the CPUs (even though the Samsungs are not
> good for journals in the long run) and not just by the fact that Samsung
> SSDs are bad at O_DSYNC writes (or maybe both). It is probable that 16 SSD
> OSDs per node in a full SSD cluster is too much and the major bottleneck
> will be the CPU.
> 
That's what I kept saying. ^.^

> 4. I'm wondering whether, if we find good SSDs for the journal and keep
> the Samsungs for normal writes and reads (we can saturate 20GbE easily
> with a read benchmark; we will test 40GbE soon), the cluster will stay
> healthy, since the Samsungs seem to get burnt by O_DSYNC writes.
> 
They will get burned, as in have their cells worn out by any and all
writes.

> 5. In terms of HBA controllers, have you guys done any tests for a full
> SSD cluster or even just for SSD journals?
> 
If you have separate journals and OSDs, it often makes good sense to have
them on separate controllers as well. 
It all depends on density of your setup and capabilities of the
controllers.
LSI HBAs in IT mode are a known and working entity.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



