We have separate primary and backup clusters running in two
distinct physical locations, serving RBD images (totaling ~12TB at
the moment) to CIFS/NFS/iSCSI re-share hosts, which in turn serve the
clients. I take daily snapshots on the primary cluster, then
export-diff/import-diff them to the backup cluster, and then rotate the
snapshots (a rough sketch of that nightly cycle is further down in this
message). This covers us if:
- Someone deletes data that was captured by a previous snapshot. I can
mount the snapshot and either I or the user can recover the affected
files (see the sketch right after this list).
- We run into something like J-P's situation, where I get stuck and lose
data on the primary cluster due to either a Ceph bug/error or a mistake
on my part.
- We lose the primary site: flood/fire, etc.
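
For that first bullet, recovery is usually just a matter of mapping the
snapshot read-only on a recovery box and copying the file out. A minimal
sketch of what that looks like; the pool/image/snapshot names, filesystem
type, and paths below are made-up placeholders:

#!/usr/bin/env python
# Minimal sketch: recover one file by mapping an RBD snapshot read-only.
# Pool/image/snapshot names, the filesystem type, and all paths are
# hypothetical placeholders.
import subprocess

POOL, IMAGE, SNAP = "rbd", "fileserver01", "daily-2015-05-05"
MOUNTPOINT = "/mnt/snap-restore"

# Map the snapshot read-only; 'rbd map' prints the /dev/rbdX device it creates.
dev = subprocess.check_output(
    ["rbd", "map", "--read-only", "%s/%s@%s" % (POOL, IMAGE, SNAP)]
).decode().strip()

try:
    # Mount read-only (add -t if the filesystem isn't auto-detected).
    subprocess.check_call(["mount", "-o", "ro", dev, MOUNTPOINT])
    # Copy the affected file back to the live share (placeholder paths).
    subprocess.check_call(
        ["cp", "-a", MOUNTPOINT + "/home/user/report.ods", "/srv/share/home/user/"])
finally:
    subprocess.call(["umount", MOUNTPOINT])
    subprocess.call(["rbd", "unmap", dev])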
As a bonus, the backup cluster gives me a place to test
upgrades/configuration changes before committing to them on the
production system.
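
For the curious, the nightly snapshot/export-diff/import-diff cycle I
described above boils down to something like the sketch below. Pool,
image, and host names are placeholders, and the real job also seeds the
backup image with a full export the first time and does far more error
checking:

#!/usr/bin/env python
# Sketch of the nightly snapshot + export-diff/import-diff cycle described
# above. Pool, image, snapshot, and host names are placeholders; the first
# run needs a full 'rbd export'/'rbd import' to seed the backup image.
import datetime
import subprocess

POOL = "rbd"
IMAGE = "fileserver01"
BACKUP_HOST = "backup-cluster-admin"          # hypothetical ssh target
TODAY = datetime.date.today().isoformat()     # e.g. 2015-05-06
YESTERDAY = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()

spec = "%s/%s" % (POOL, IMAGE)

# 1. Take today's snapshot on the primary cluster.
subprocess.check_call(["rbd", "snap", "create", "%s@%s" % (spec, TODAY)])

# 2. Ship only the delta since yesterday's snapshot to the backup cluster.
#    export-diff writes to stdout ('-'); import-diff reads from stdin ('-').
export = subprocess.Popen(
    ["rbd", "export-diff", "--from-snap", YESTERDAY, "%s@%s" % (spec, TODAY), "-"],
    stdout=subprocess.PIPE)
subprocess.check_call(
    ["ssh", BACKUP_HOST, "rbd", "import-diff", "-", spec],
    stdin=export.stdout)
export.stdout.close()
if export.wait() != 0:
    raise RuntimeError("rbd export-diff failed")

# 3. Rotate: drop yesterday's snapshot on both clusters once the diff landed.
subprocess.check_call(["rbd", "snap", "rm", "%s@%s" % (spec, YESTERDAY)])
subprocess.check_call(["ssh", BACKUP_HOST, "rbd", "snap", "rm", "%s@%s" % (spec, YESTERDAY)])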
I think Hammer allows for synchronous RBD mirroring between clusters
(haven't played with that yet), and it looks like async mirroring is
on the roadmap for Infernalis:
https://wiki.ceph.com/Planning/Blueprints/Infernalis/RBD_Async_Mirroring
-Steve
On 05/06/2015 01:31 PM, J-P Methot wrote:
Case in point, here's a little story about why backups outside Ceph are
necessary:
I was working on modifying journal locations for a running test
Ceph cluster when, after bringing a few OSD nodes back up, two PGs
started being marked as incomplete. That made all operations on
the pool hang because, for some reason, RBD clients couldn't read the
missing PGs and there was no timeout value for their operations.
After spending half a day trying to fix this, I ended up needing to
delete the pool and then recreate it. Thankfully that setup was
not in production, so it was only a minor setback.
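
For what it's worth, the stuck state was visible through the standard
status commands. A quick check along these lines (the filtering is only
illustrative, and this reports the problem rather than fixing it) is how
you would spot PGs stuck like that:

#!/usr/bin/env python
# Quick sketch: spot PGs stuck in an inactive state (e.g. 'incomplete')
# using the stock ceph CLI. This only reports the problem; it is not a fix.
import subprocess

# Human-readable detail, e.g. lines like "pg 3.7f is incomplete, acting [...]".
print(subprocess.check_output(["ceph", "health", "detail"]).decode())

# PGs stuck inactive; clients cannot complete I/O against these, which is
# why operations on the pool hang.
stuck = subprocess.check_output(
    ["ceph", "pg", "dump_stuck", "inactive"]).decode()
for line in stuck.splitlines():
    if "incomplete" in line:
        print("incomplete pg: %s" % line)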
So, when we go into production with our setup, we are planning to
have a second Ceph cluster for backups, just in case such an issue
happens again. I don't want to scare anyone, and I'm pretty sure
my issue was very exceptional, but no matter how well Ceph
replicates and ensures data safety, backups are still a good
idea, in my humble opinion.
On 5/6/2015 6:35 AM, Mariusz Gronczewski wrote:
A snapshot on the same storage cluster should definitely NOT be treated
as a backup.
A snapshot as a source for a backup, however, can be a pretty good
solution for some cases, but not every case.
For example, if using Ceph to serve static web files, I'd rather have the
possibility of restoring a given file from a given path than a snapshot of
a whole multi-TB cluster.
There are two cases for a backup restore:
* something failed and needs to be fixed - usually a full restore is needed
* someone accidentally removed a thing, and now they need that thing back
Snapshots fix the first problem, but not the second one: restoring 7TB of
data to recover a few GB is not reasonable.
As it is now, we just back up from inside the VMs (file-based backup) and
have Puppet to easily recreate machine configs, but if (or rather when) we
start using the object store, we would back it up in a way that allows for
partial restores.
On Wed, 6 May 2015 10:50:34 +0100, Nick Fisk <nick@xxxxxxxxxx> wrote:
For me personally I would always feel more comfortable with backups on a completely different storage technology.
Whilst there are many things you can do with snapshots and replication, there is always a small risk that whatever causes data loss on your primary system may affect/replicate to your 2nd copy.
I guess it all really depends on what you are trying to protect against, but Tape still looks very appealing if you want to maintain a completely isolated copy of data.
-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
Alexandre DERUMIER
Sent: 06 May 2015 10:10
To: Götz Reinicke
Cc: ceph-users
Subject: Re: How to backup hundreds or thousands of TB
For the moment, you can use snapshots for backups:
https://ceph.com/community/blog/tag/backup/
I think async mirroring is on the roadmap:
https://wiki.ceph.com/Planning/Blueprints/Hammer/RBD%3A_Mirroring
If you use qemu, you can do a qemu full backup (qemu incremental backup is
coming in qemu 2.4).
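
To illustrate, a full backup of a running guest can be triggered through
QMP's drive-backup command. A rough sketch, assuming a QMP unix socket and
a block device name that are of course placeholders here:

#!/usr/bin/env python
# Minimal sketch of triggering a qemu full backup of a running guest via
# the QMP 'drive-backup' command. The socket path, drive name, and target
# file are hypothetical; error handling and event tracking are omitted.
import json
import socket

QMP_SOCKET = "/var/run/qemu/vm01.qmp"     # placeholder socket path
DRIVE = "drive-virtio-disk0"              # placeholder block device name
TARGET = "/backup/vm01-disk0.qcow2"       # where the full copy is written

def qmp_send(sock, cmd):
    # Send one QMP command and return its decoded reply.
    sock.sendall((json.dumps(cmd) + "\r\n").encode())
    return json.loads(sock.recv(65536).decode())

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect(QMP_SOCKET)
s.recv(65536)                                         # QMP greeting banner
qmp_send(s, {"execute": "qmp_capabilities"})          # enter command mode
# 'sync': 'full' copies the whole disk; incremental mode needs the dirty
# bitmap support that only lands in qemu 2.4.
print(qmp_send(s, {"execute": "drive-backup",
                   "arguments": {"device": DRIVE,
                                 "sync": "full",
                                 "target": TARGET,
                                 "format": "qcow2"}}))
s.close()

A real script would then wait for the BLOCK_JOB_COMPLETED event before
considering the backup done.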
----- Original Message -----
From: "Götz Reinicke" <goetz.reinicke@xxxxxxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Wednesday, 6 May 2015 10:25:01
Subject: How to backup hundreds or thousands of TB
Hi folks,
besides hardware, performance, and failover design: how do you manage
to back up hundreds or thousands of TB :) ?
Any suggestions? Best practices?
A second Ceph cluster at a different location? "Bigger archive" disks in
good boxes? Or tape libraries?
What kind of backup software can handle such volumes nicely?
Thanks and regards, Götz
--
Götz Reinicke
IT Coordinator
Tel. +49 7141 969 82 420
E-Mail goetz.reinicke@xxxxxxxxxxxxxxx
Filmakademie Baden-Württemberg GmbH
Akademiehof 10
71638 Ludwigsburg
www.filmakademie.de
Registered at Amtsgericht Stuttgart, HRB 205016
Chairman of the Supervisory Board: Jürgen Walter MdL, State Secretary in
the Ministry of Science, Research and the Arts, Baden-Württemberg
Managing Director: Prof. Thomas Schadt
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
======================
Jean-Philippe Méthot
Administrateur système / System administrator
GloboTech Communications
Phone: 1-514-907-0050
Toll Free: 1-(888)-GTCOMM1
Fax: 1-(514)-907-0750
jpmethot@xxxxxxxxxx
http://www.gtcomm.net
--
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma310@xxxxxxxxxx