Re: Improving Performance with more OSD's?

On Sun, 28 Dec 2014 08:59:33 +1000 Lindsay Mathieson wrote:

> I'm looking to improve the raw performance on my small setup (2 Compute
> Nodes, 2 OSD's). Only used for hosting KVM images.
> 
This doesn't really make things clear: do you mean 2 STORAGE nodes with 2
OSDs (HDDs) each?
Either way, that's a very small setup (and with a replication factor of 2
a risky one, too), so don't expect great performance.

It would help if you'd tell us what these nodes are made of
(CPU, RAM, disks, network) so we can at least guess what that cluster
might be capable of.

> Raw read/write is roughly 200/35 MB/s. Starting 4+ VM's simultaneously
> pushes iowaits over 30%, though the system keeps chugging along.
> 
Throughput numbers aren't exactly worthless, but you will find IOPS to be
the killer in most cases. Also, without describing how you measured these
numbers (rados bench, fio, bonnie; on the host or inside a VM), they are
even harder to interpret.
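
For what it's worth, numbers from something like the following are much
easier to compare (the pool name "rbd", the runtimes and the test file
are just placeholders, adjust them to your setup):

  # sequential throughput as seen by RADOS itself
  rados bench -p rbd 60 write -t 16 --no-cleanup
  rados bench -p rbd 60 seq -t 16

  # small random writes, i.e. the IOPS side of things
  fio --name=randwrite --ioengine=libaio --direct=1 --rw=randwrite \
      --bs=4k --iodepth=32 --size=1G --runtime=60 --time_based \
      --filename=fio-testfile

Run the fio bit both on a storage node and inside a VM and you'll see
where the time goes.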

> Budget is limited ... :(
> 
> I plan to upgrade my SSD journals to something better than the Samsung
> 840 EVO's (Intel 520/530?)
> 
Not a big improvement, really.
Take a look at the 100GB Intel DC S3700: while it can write "only" at
200MB/s, it is priced rather nicely and it will deliver that
performance at ANY time, and for a long time, too.
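
If you want to see why the EVOs hurt as journals, test the SSD with
O_DSYNC writes, which is how the journal writes (the path below is just
an example and the test file gets clobbered):

  dd if=/dev/zero of=/path/to/ssd/journal-test bs=4k count=100000 \
     oflag=direct,dsync

Consumer SSDs tend to drop to a small fraction of their advertised speed
in this test, while the DC S3700 keeps going at more or less full speed.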

> One of the things I see mentioned a lot in blogs etc is how ceph's
> performance improves as you add more OSD's and that the quality of the
> disks does not matter so much as the quantity.
> 
> How does this work? does ceph stripe reads and writes across the OSD's
> to improve performance?
> 
Yes and no. RADOS stores data as (by default) 4MB objects, and each
object is mapped via CRUSH to its own set of OSDs, so with enough OSDs
and clients I/O becomes well distributed and scales up nicely. However, a
single client could be hitting the same object on the same OSD all the
time (a small DB file, for example), so you won't see much, if any,
improvement in that case.
There is also the option to stripe things on a much smaller scale;
however, that takes some planning and needs to be decided up front, at
creation time. See the striping section of the Ceph documentation.
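
As an illustration, an RBD format 2 image can be created with
non-default ("fancy") striping; the numbers here are arbitrary, not a
recommendation:

  rbd create rbd/test-image --size 10240 --image-format 2 \
      --stripe-unit 65536 --stripe-count 8

That writes data in 64KB chunks round-robin across sets of 8 objects, so
even a single busy client spreads its I/O over more OSDs.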

> If I add 3 cheap OSD's to each node (500GB - 1TB) with 10GB SSD journal 
> partition each could I expect a big improvement in performance?
> 
That depends a lot on the stuff you haven't told us (CPU/RAM/network).
Provided there is enough of all of those, especially CPU, the answer is
yes.
A large amount of RAM on the storage nodes will improve reads, as hot
objects end up in the page cache and stay there.
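
For reference, one way to add such an OSD with its journal on an SSD
partition is via ceph-deploy (hostname and devices below are made up,
and the journal partition has to exist already):

  ceph-deploy osd create osdnode1:sdd:/dev/sdb5

Also keep in mind that three 10GB journal partitions on one SSD means
three OSDs sharing that SSD's sync write speed, one more reason to pick
something like the DC S3700.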

Of course, having decent HDDs will help even with journals on SSDs. For
example, the Toshiba DTxx drives (totally not recommended for ANYTHING)
cost about the same as Toshiba's entry-level "enterprise" MG0x drives,
which are nearly twice as fast in the IOPS department.

> What sort of redundancy to setup? currently its min= 1, size=2. Size is
> not an issue, we already have 150% more space than we need, redundancy
> and performance is more important.
> 
You really, really want size 3 and a third node for both performance
(reads) and redundancy.
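
Changing that later is straightforward (use whichever pool your images
live in, "rbd" here is just an example), though creating the extra
replicas will take a while and generate load:

  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2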

> Now I think on it, we can live with the slow write performance, but
> reducing iowait would be *really* good.
> 
Decent SSDs (see above) and more (decent) spindles will help with both.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


