Hi,

On 23.05.2014 at 16:09, Dan Van Der Ster <daniel.vanderster at cern.ch> wrote:

> Hi,
> I think you're rather brave (sorry, foolish) to store the mon data dir in ramfs. One power outage and your cluster is dead. Even with good backups of the data dir I wouldn't want to go through that exercise.

I know - I'm still testing my environment and don't really plan to use ramfs in production, but technically it's quite interesting ;)

> Saying that, we had a similar disk-io-bound problem with the mon data dirs, and solved it by moving the mons to SSDs. Maybe in your case using the cfq io scheduler would help, since at least then the OSD and MON processes would get fair shares of the disk IOs.

Oh, when did they switch the default scheduler to deadline? Thanks for the hint - I've moved to cfq and the tests are running.

> Anyway, to backup the data dirs, you need to stop the mon daemon to get a consistent leveldb before copying the data to a safe place.

That wouldn't be a real problem, but I'm wondering how effective it would be: is such a backup still enough to restore from once data objects have changed since it was taken? I don't think so :(

To sum up:

* ceph stops/freezes as soon as the number of mon nodes drops below quorum
* ceph continues to work as soon as the nodes come back up
* I could create a fresh mon on every node directly on boot by importing the current state ("ceph-mon --force-sync --yes-i-really-mean-it ...")

So, as long as there are enough mons to form a quorum, it should work with ramfs. If nodes fail one by one, ceph stops when quorum is lost and continues when the nodes are back. But if all nodes go down at once (e.g. a power outage), my ceph cluster is dead, and backups wouldn't prevent that, would they? Maybe snapshotting the pool could help?

Backup:
* create a snapshot
* shut down one mon
* back up the mon dir

Restore:
* import the mon dir
* create further mons until quorum is restored
* restore the snapshot

Possible?.. :D

Thanks,
Fabian
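P.S. For the backup step, what I have in mind is just stopping the mon and tarring its data dir, roughly like this (only a sketch - the mon id "node1", the paths and the service commands are examples and depend on the distro/init system):

    # stop the mon so the leveldb store is consistent
    service ceph stop mon.node1

    # copy the mon data dir somewhere safe
    tar czf /backup/mon-node1-$(date +%F).tar.gz /var/lib/ceph/mon/ceph-node1

    # start the mon again; it catches up from the other mons
    service ceph start mon.node1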
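And the "create a fresh mon on every node directly on boot" part would roughly follow the usual manual add-a-monitor steps, alongside (or instead of) the --force-sync call - again just a sketch, with "node1" and the temp paths as placeholders:

    # fetch the current monmap and mon keyring from the surviving quorum
    ceph mon getmap -o /tmp/monmap
    ceph auth get mon. -o /tmp/mon.keyring

    # build an empty mon store in the (ram-backed) data dir
    ceph-mon -i node1 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring

    # start it and let it sync the full store from its peers
    service ceph start mon.node1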