On Wed, Sep 28, 2016 at 8:03 AM, Ranjan Ghosh <ghosh@xxxxxx> wrote:
> Hi everyone,
>
> Up until recently, we were using GlusterFS to keep two web servers in sync
> so we could take one down and switch back and forth between them - e.g. for
> maintenance or failover. Usually, both were running, though. Unfortunately,
> the performance was abysmal. Copying many small files on the file system
> caused outages of several minutes - simply unacceptable. So I found Ceph.
> It's fairly new, but I thought I'd give it a try. I especially liked the
> good, detailed documentation, the configurability and the many command-line
> tools that let you find out what is going on with your cluster. All of this
> is severely lacking in GlusterFS, IMHO.
>
> Because we're on a very tiny budget for this project, we cannot currently
> have more than two file system servers. I added a small virtual server,
> though, only for monitoring, so at least we have 3 monitor nodes. I also
> created 3 MDSs, although as far as I understand, two of them are only
> standbys. To sum it up, we have:
>
> server0: Admin (deployment started from here) + Monitor + MDS
> server1: Monitor + MDS + OSD
> server2: Monitor + MDS + OSD
>
> So the OSDs are on server1 and server2, which are next to each other and
> connected by a local Gigabit Ethernet link. The cluster is mounted (also on
> server1 and server2) as /var/www, and Apache serves files off the cluster.
>
> I've used these configuration settings:
>
> osd pool default size = 2
> osd pool default min_size = 1
>
> My idea was that by default everything should be replicated to 2 servers,
> i.e. each file is normally written to both server1 and server2. In an
> emergency, though (one server has a failure), it's better to keep operating
> and only write the file to one server; therefore, I set min_size = 1. My
> further understanding is (correct me if I'm wrong) that when the failed
> server comes back online, the files that were written to only 1 server
> during the outage will automatically be replicated to the server that has
> come back online.
>
> So far, so good. With both servers online, the performance is light-years
> away from sluggish GlusterFS. I've also worked with XtreemFS, OCFS2 and AFS
> and never had such good performance with any cluster. In fact, it's so
> blazingly fast that I had to check twice that I really had the cluster
> mounted and wasn't accidentally working on the local hard drive.
> Impressive. I can edit files on server1 and they are immediately changed on
> server2, and vice versa. Great!

Nice! Thanks for sharing your details.

> Unfortunately, when I now stop all Ceph services on server1, the websites
> on server2 start to hang/freeze, and "ceph health" shows "#x blocked
> requests". Now, what I don't understand: why is it blocking? Shouldn't both
> servers have the file? And didn't I set min_size to "1"? And if there are a
> few files (could be some unimportant stuff) missing on one of the servers:
> how can I abort the blocking? I'd rather have a missing file or whatever
> than a completely blocking website.

Are all the pools actually using min_size = 1? Did you check the PG stats
and see which ones are waiting? Some steps to debug further are described
here:
http://docs.ceph.com/docs/jewel/rados/operations/monitoring-osd-pg/
Also, did you shut down the server abruptly while it was busy?
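For example, you could check the settings that actually apply to each pool
and which PGs are stuck with something like this (the pool name
"cephfs_data" is only an example - use whatever "ceph osd lspools" shows on
your cluster):

    # size / min_size as they are actually set on each pool
    ceph osd pool ls detail
    ceph osd pool get cephfs_data size
    ceph osd pool get cephfs_data min_size

    # overall PG state and the PGs that are currently stuck
    ceph health detail
    ceph pg stat
    ceph pg dump_stuck inactive
    ceph pg dump_stuck unclean

"ceph health detail" will also tell you whether the blocked requests are
tied to specific OSDs or PGs.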
> Are my files really duplicated 1:1 - or are they perhaps spread evenly
> between both OSDs? Do I have to edit the crushmap to achieve a real
> "RAID-1"-type of replication? Is there a command to find out for a
> specific file where it actually resides and whether it has really been
> replicated?
>
> Thank you!
> Ranjan
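With size = 2 and the default CRUSH rule (one replica per host), every
object should already end up on both servers, so you should not need to
edit the crushmap for a RAID-1-style layout. For a specific file you can
look up where its objects are placed. A rough sketch (the data pool name
"cephfs_data" and the file path are only examples - substitute your own):

    # CephFS stores a file's data in objects named <inode in hex>.<block nr>
    FILE=/var/www/index.html
    INO_HEX=$(printf '%x' "$(stat -c %i "$FILE")")

    # show the PG and the OSDs for the file's first object
    ceph osd map cephfs_data "${INO_HEX}.00000000"

    # list all objects belonging to that file (can be slow on large pools)
    rados -p cephfs_data ls | grep "^${INO_HEX}\."

The "up" and "acting" sets in the "ceph osd map" output show which OSDs
hold (or would hold) that object; if both OSD IDs are listed there and the
PG is active+clean, the data is replicated on both servers.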