Ceph Very Small Cluster

Ranjan Ghosh <ghosh@xxxxxx> · Wed, 28 Sep 2016 17:03:39 +0200

Hi everyone,

Up until recently, we were using GlusterFS to have two web servers in 
sync so we could take one down and switch back and forth between them - 
e.g. for maintenance or failover. Usually, both were running, though. 
The performance was abysmal, unfortunately. Copying many small files on 
the file system caused outages for several minutes - simply 
unacceptable. So I found Ceph. It's fairly new, but I thought I'd give it
a try. I especially liked the good, detailed documentation, the
configurability and the many command-line tools which allow you to find
out what is going on with your cluster. All of this is severely lacking
in GlusterFS, IMHO.

Because we're on a very tiny budget for this project, we cannot currently
have more than two file system servers. I added a small virtual server,
though, which only runs a monitor and an MDS (no OSD). So at least we
have 3 monitor nodes. I also created 3 MDSs, though as far as I
understood, two of them are only on standby. To sum it up, we have:

server0: Admin (Deployment started from here) + Monitor + MDS
server1: Monitor + MDS + OSD
server2: Monitor + MDS + OSD
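
As a sanity check on the monitor count: as far as I understand, the
monitors need a strict majority to form a quorum, which is why I wanted
3 of them rather than 2. Below is a toy calculation of that rule in
Python. It is only my own illustration, not Ceph code, and the function
names are made up.

```python
def has_quorum(total_monitors: int, monitors_up: int) -> bool:
    """Strict majority rule: more than half of all monitors must be up."""
    return monitors_up > total_monitors // 2


def tolerated_failures(total_monitors: int) -> int:
    """Largest number of monitors that can be down while a majority remains."""
    return (total_monitors - 1) // 2


if __name__ == "__main__":
    for total in (2, 3):
        print(total, "monitors tolerate", tolerated_failures(total), "failure(s)")
```

If that rule is right, stopping the monitor on server1 should still
leave server0 and server2 in quorum.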

So there is one OSD each on server1 and server2, which sit next to each
other and are connected by a local Gigabit Ethernet link. The cluster is
mounted (also on server1 and server2) as /var/www, and Apache is serving
files off the cluster.

I've used these configuration settings:

osd pool default size = 2
osd pool default min_size = 1

My idea was that by default everything should be replicated on 2 servers,
i.e. each file is normally written to both server1 and server2. In an
emergency, though (one server has failed), it's better to keep operating
and write the file to only one server. Therefore, I set min_size = 1. My
further understanding is (correct me if I'm wrong) that when the server
comes back online, the files that were written to only 1 server during
the outage will automatically be replicated to the server that has come
back online.
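
To make that understanding concrete, here is how I picture the two
settings, written as a toy Python model. This is only my mental model
and not how Ceph is implemented; the class and function names are
invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PoolSettings:
    size: int      # desired number of replicas ("osd pool default size")
    min_size: int  # replicas that must be available before I/O is accepted


def accepts_io(pool: PoolSettings, replicas_up: int) -> bool:
    """I/O should proceed as long as at least min_size replicas are up."""
    return replicas_up >= pool.min_size


def is_degraded(pool: PoolSettings, replicas_up: int) -> bool:
    """Fewer replicas than 'size' means data must be re-replicated later."""
    return replicas_up < pool.size


if __name__ == "__main__":
    pool = PoolSettings(size=2, min_size=1)
    for up in (2, 1, 0):
        print(up, "replica(s) up:",
              "I/O ok" if accepts_io(pool, up) else "I/O blocked",
              "(degraded)" if is_degraded(pool, up) else "")
```

In this model, one server being down with size = 2 and min_size = 1
means "degraded, but still serving I/O", which is the behaviour I was
expecting.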

So far, so good. With two servers now online, the performance is
light-years away from sluggish GlusterFS. I've also worked with
XtreemFS, OCFS2 and AFS and never had such good performance with any
cluster. In fact, it's so blazingly fast that I had to check twice that I
really had the cluster mounted and wasn't accidentally working on the
local hard drive. Impressive. I can edit files on server1 and they are
immediately changed on server2 and vice versa. Great!

Unfortunately, when I now stop all Ceph services on server1, the
websites on server2 start to hang/freeze, and "ceph health" shows "#x
blocked requests". Now, what I don't understand: Why is it blocking?
Shouldn't both servers have the file? And didn't I set min_size to "1"?
And if there are a few files (could be some unimportant stuff) that are
missing on one of the servers: How can I abort the blocking? I'd rather
have a missing file or whatever than a completely blocked website.

Are my files really duplicated 1:1 - or are they perhaps spread evenly 
between both OSDs? Do I have to edit the crushmap to achieve a real 
"RAID-1"-type of replication? Is there a command to find out for a 
specific file where it actually resides and whether it has really been 
replicated?

Thank you!
Ranjan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com