Fwd: Bad performance of CephFS (first use)

On 9.5.2014 9:08, Christian Balzer wrote:
> Is that really just one disk?

Yes, it's just one disk in each PC. I know the setup is bad, but I just
want to get familiar with Ceph (and other parallel filesystems like
Gluster or Lustre) and see what they can and cannot do.

> You have the reason for the write performance half right.
> Every write goes to the primary OSD of the PG for that object.
> That is, the journal of that OSD, which in your configuration I suspect is
> a file on the same XFS as the actual OSD data. Either way, it would be on
> the same disk as you only have one.
> So that write goes to the primary OSD journal, then gets replicated to the
> journal of the secondary OSD, then it gets ACKed to the client.
> Meanwhile the journals will have to get written to the actual storage
> eventually.

So the client PC writes to the journal on one OSD, and that OSD then
replicates the data from its journal to the second OSD, again into its
journal. Only after that is the data on each OSD copied from the
journal into the actual OSD storage? Interesting; in that case the
client should write at around 100 MB/s to one OSD, then stop and wait
for that OSD to replicate the data to the second OSD (also at around
100 MB/s), and then everything is done. Afterwards each disk, holding
both the journal and the storage space, has to copy all the data once
more onto itself.

So the journal is some kind of cache for OSDs?
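
If that understanding is correct, the sequence can be put into a toy
timing model. Below is a small Python sketch of my own (not Ceph code;
the disk speed, network speed, chunk size and the strictly serial
ordering are all assumptions) for a 2-replica write where each OSD has
a single disk holding both its journal and its data store:

# Toy model of a 2-replica Ceph FileStore write path where each OSD
# has one disk holding both its journal and its data store.
# All numbers are assumptions for illustration, not measurements.

DISK_MBPS = 100.0   # sequential write speed of one disk (assumed)
NET_MBPS = 117.0    # ~1 GbE payload rate (assumed)
CHUNK_MB = 4.0      # RADOS object size (assumed default)

def ack_latency_s(chunk_mb=CHUNK_MB):
    """Time until the client gets an ACK for one chunk:
    network to primary -> primary journal write ->
    network to secondary -> secondary journal write.
    Real OSDs pipeline these steps; this serial model is a
    worst-case sketch."""
    net = chunk_mb / NET_MBPS
    jrn = chunk_mb / DISK_MBPS
    return net + jrn + net + jrn

def sustained_mbps():
    """Sustained rate once journals must be flushed: every byte
    hits the same disk twice (journal + data store), so each disk
    can absorb at most half its raw speed in client data."""
    return DISK_MBPS / 2.0

print("per-chunk ack latency: %.0f ms" % (ack_latency_s() * 1000))
print("burst rate (journal only): ~%.0f MB/s" % (CHUNK_MB / ack_latency_s()))
print("sustained rate (with flushes): <= ~%.0f MB/s" % sustained_mbps())

In this model the journal does act like a short write-ahead buffer:
bursts are absorbed and ACKed quickly, but sustained throughput is
capped by the later flush to the data store on the same disk.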

From the graphs I got, it seems that the client is sending data to
both OSDs in parallel, into their journals. Then each OSD copies the
data once more onto itself (not sure). But I don't know why the
network traffic has these spikes. Is it because the client writes some
chunk of data and then waits for something before the next chunk can
be sent?
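
One way I could check this would be to watch the OSD's own journal
counters while the client writes, and see whether the spikes line up
with journal writes or with flushes to the data store. A rough Python
sketch, assuming it runs on the OSD host and that the admin socket
command "ceph daemon osd.<id> perf dump" is available; since counter
names vary between Ceph versions, it just prints everything under
"filestore" that mentions "journal":

# Poll journal-related FileStore perf counters on one OSD once per
# second. Assumes local access to the OSD admin socket; exact counter
# names differ between Ceph versions, so filter by name instead of
# hard-coding keys.
import json
import subprocess
import time

OSD_ID = 0  # adjust to the local OSD id

def journal_counters(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
    perf = json.loads(out.decode("utf-8"))
    filestore = perf.get("filestore", {})
    return dict((k, v) for k, v in filestore.items() if "journal" in k)

while True:
    print("%s %s" % (time.strftime("%H:%M:%S"), journal_counters(OSD_ID)))
    time.sleep(1)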

> So each write happens basically twice, your single disk now only has an
> effective speed of around 60-70MB/s (couldn't find any benchmarks for your
> model, but most drives of this type have write speeds up to 140MB/s).

They can write at around 100 MB/s for sure.
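
Putting the numbers together as back-of-envelope arithmetic (a sketch;
replica count 2 and my ~100 MB/s per-disk figure are assumed):

# Best-case client throughput ceiling with 2-replica writes and the
# journal sharing a disk with the data store: every client byte is
# written replicas * 2 = 4 times across the cluster.
disks = 2
disk_mbps = 100.0
replicas = 2
writes_per_byte = 2  # journal + data store on the same disk

raw = disks * disk_mbps                       # 200 MB/s raw bandwidth
ceiling = raw / (replicas * writes_per_byte)  # every byte written 4x
print("expected ceiling: ~%.0f MB/s" % ceiling)  # ~50 MB/s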

> Now add to this the fact that the replication from the other OSD will of
> course also impact things.
> That network bandwidth for the replication has to come from somewhere...
>
> Look at what the recommended configurations by Inktank are and at previous
> threads in here to get an idea what helps.
>
> Since I doubt you have the budget or parts for your test setup to add more
> disks, SSDs for journals, HW cache controllers, additional network cards
> and so forth I guess you will have to live with this performance for now.
>
> Christian
>
> --
> Christian Balzer        Network/Systems Engineer
> chibi at gol.com        Global OnLine Japan/Fusion Communications
> http://www.gol.com/

I will do that. Thank you very much for your reply!

