We have separate primary and backup clusters running in two
distinct physical locations, serving RBD images (totaling ~12TB at
the moment) to CIFS/NFS/iSCSI re-share hosts, which in turn serve the
clients. I take daily snapshots on the primary cluster, then
export-diff/import-diff them to the backup cluster, and then rotate the
snapshots (a rough sketch of that nightly cycle is further down in this
message). This covers us if:
- Someone deletes data that was captured by a previous snapshot. I can
mount the snapshot and either I or the user can recover the affected
files (see the sketch right after this list).
- We run into something like J-P's situation, where I get stuck and lose
data on the primary cluster due to either a Ceph bug/error or a mistake
on my part.
- We lose the primary site: flood/fire, etc.
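
For that first bullet, recovery is usually just a matter of mapping the
snapshot read-only on a recovery box and copying the file out. A minimal
sketch of what that looks like; the pool/image/snapshot names, filesystem
type, and paths below are made-up placeholders:

#!/usr/bin/env python
# Minimal sketch: recover one file by mapping an RBD snapshot read-only.
# Pool/image/snapshot names, the filesystem type, and all paths are
# hypothetical placeholders.
import subprocess

POOL, IMAGE, SNAP = "rbd", "fileserver01", "daily-2015-05-05"
MOUNTPOINT = "/mnt/snap-restore"

# Map the snapshot read-only; 'rbd map' prints the /dev/rbdX device it creates.
dev = subprocess.check_output(
    ["rbd", "map", "--read-only", "%s/%s@%s" % (POOL, IMAGE, SNAP)]
).decode().strip()

try:
    # Mount read-only (add -t if the filesystem isn't auto-detected).
    subprocess.check_call(["mount", "-o", "ro", dev, MOUNTPOINT])
    # Copy the affected file back to the live share (placeholder paths).
    subprocess.check_call(
        ["cp", "-a", MOUNTPOINT + "/home/user/report.ods", "/srv/share/home/user/"])
finally:
    subprocess.call(["umount", MOUNTPOINT])
    subprocess.call(["rbd", "unmap", dev])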
As a bonus, the backup cluster gives me a place to test
upgrades/configuration changes before committing to them on the
production system.
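
For the curious, the nightly snapshot/export-diff/import-diff cycle I
described above boils down to something like the sketch below. Pool,
image, and host names are placeholders, and the real job also seeds the
backup image with a full export the first time and does far more error
checking:

#!/usr/bin/env python
# Sketch of the nightly snapshot + export-diff/import-diff cycle described
# above. Pool, image, snapshot, and host names are placeholders; the first
# run needs a full 'rbd export'/'rbd import' to seed the backup image.
import datetime
import subprocess

POOL = "rbd"
IMAGE = "fileserver01"
BACKUP_HOST = "backup-cluster-admin"          # hypothetical ssh target
TODAY = datetime.date.today().isoformat()     # e.g. 2015-05-06
YESTERDAY = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()

spec = "%s/%s" % (POOL, IMAGE)

# 1. Take today's snapshot on the primary cluster.
subprocess.check_call(["rbd", "snap", "create", "%s@%s" % (spec, TODAY)])

# 2. Ship only the delta since yesterday's snapshot to the backup cluster.
#    export-diff writes to stdout ('-'); import-diff reads from stdin ('-').
export = subprocess.Popen(
    ["rbd", "export-diff", "--from-snap", YESTERDAY, "%s@%s" % (spec, TODAY), "-"],
    stdout=subprocess.PIPE)
subprocess.check_call(
    ["ssh", BACKUP_HOST, "rbd", "import-diff", "-", spec],
    stdin=export.stdout)
export.stdout.close()
if export.wait() != 0:
    raise RuntimeError("rbd export-diff failed")

# 3. Rotate: drop yesterday's snapshot on both clusters once the diff landed.
subprocess.check_call(["rbd", "snap", "rm", "%s@%s" % (spec, YESTERDAY)])
subprocess.check_call(["ssh", BACKUP_HOST, "rbd", "snap", "rm", "%s@%s" % (spec, YESTERDAY)])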
I think Hammer allows for synchronous RBD mirroring between clusters
(haven't played with that yet), and it looks like async mirroring is
on the roadmap for Infernalis:
https://wiki.ceph.com/Planning/Blueprints/Infernalis/RBD_Async_Mirroring
-Steve
On 05/06/2015 01:31 PM, J-P Methot wrote:
Case in point, here's a little story about why backups outside Ceph are
necessary:
I was working on modifying journal locations for a running test
Ceph cluster when, after bringing a few OSD nodes back up, two PGs
started being marked as incomplete. That made all operations on
the pool hang because, for some reason, RBD clients couldn't read the
missing PGs and there was no timeout value for their operations.
After spending half a day trying to fix this, I ended up needing to
delete the pool and then recreate it. Thankfully that setup was
not in production, so it was only a minor setback.
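
For what it's worth, the stuck state was visible through the standard
status commands. A quick check along these lines (the filtering is only
illustrative, and this reports the problem rather than fixing it) is how
you would spot PGs stuck like that:

#!/usr/bin/env python
# Quick sketch: spot PGs stuck in an inactive state (e.g. 'incomplete')
# using the stock ceph CLI. This only reports the problem; it is not a fix.
import subprocess

# Human-readable detail, e.g. lines like "pg 3.7f is incomplete, acting [...]".
print(subprocess.check_output(["ceph", "health", "detail"]).decode())

# PGs stuck inactive; clients cannot complete I/O against these, which is
# why operations on the pool hang.
stuck = subprocess.check_output(
    ["ceph", "pg", "dump_stuck", "inactive"]).decode()
for line in stuck.splitlines():
    if "incomplete" in line:
        print("incomplete pg: %s" % line)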
So, when we go into production with our setup, we are planning to
have a second Ceph cluster for backups, just in case such an issue
happens again. I don't want to scare anyone, and I'm pretty sure
my issue was very exceptional, but no matter how well Ceph
replicates and ensures data safety, backups are still a good
idea, in my humble opinion.
On 5/6/2015 6:35 AM, Mariusz Gronczewski wrote:
A snapshot on the same storage cluster should definitely NOT be treated
as a backup.
A snapshot as a source for a backup, however, can be a pretty good
solution for some cases, but not every case.
For example, if using Ceph to serve static web files, I'd rather have the
possibility of restoring a given file from a given path than a snapshot of
a whole multi-TB cluster.
There are two cases for a backup restore:
* something failed and needs to be fixed - usually a full restore is needed
* someone accidentally removed a thing, and now they need that thing back
Snapshots fix the first problem, but not the second one: restoring 7TB of
data to recover a few GB is not reasonable.
As it is now, we just back up from inside the VMs (file-based backup) and
have Puppet to easily recreate machine configs, but if (or rather when) we
start using the object store, we would back it up in a way that allows for
partial restores.
On Wed, 6 May 2015 10:50:34 +0100, Nick Fisk <nick@xxxxxxxxxx> wrote:
For me personally I would always feel more comfortable with backups on a completely different storage technology.
Whilst there are many things you can do with snapshots and replication, there is always a small risk that whatever causes data loss on your primary system may affect/replicate to your 2nd copy.
I guess it all really depends on what you are trying to protect against, but Tape still looks very appealing if you want to maintain a completely isolated copy of data.
-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
Alexandre DERUMIER
Sent: 06 May 2015 10:10
To: Götz Reinicke
Cc: ceph-users
Subject: Re: How to backup hundreds or thousands of TB
For the moment, you can use snapshots for backups:
https://ceph.com/community/blog/tag/backup/
I think async mirroring is on the roadmap:
https://wiki.ceph.com/Planning/Blueprints/Hammer/RBD%3A_Mirroring
If you use qemu, you can do a qemu full backup (qemu incremental backup is
coming in qemu 2.4).
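
To illustrate, a full backup of a running guest can be triggered through
QMP's drive-backup command. A rough sketch, assuming a QMP unix socket and
a block device name that are of course placeholders here:

#!/usr/bin/env python
# Minimal sketch of triggering a qemu full backup of a running guest via
# the QMP 'drive-backup' command. The socket path, drive name, and target
# file are hypothetical; error handling and event tracking are omitted.
import json
import socket

QMP_SOCKET = "/var/run/qemu/vm01.qmp"     # placeholder socket path
DRIVE = "drive-virtio-disk0"              # placeholder block device name
TARGET = "/backup/vm01-disk0.qcow2"       # where the full copy is written

def qmp_send(sock, cmd):
    # Send one QMP command and return its decoded reply.
    sock.sendall((json.dumps(cmd) + "\r\n").encode())
    return json.loads(sock.recv(65536).decode())

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect(QMP_SOCKET)
s.recv(65536)                                         # QMP greeting banner
qmp_send(s, {"execute": "qmp_capabilities"})          # enter command mode
# 'sync': 'full' copies the whole disk; incremental mode needs the dirty
# bitmap support that only lands in qemu 2.4.
print(qmp_send(s, {"execute": "drive-backup",
                   "arguments": {"device": DRIVE,
                                 "sync": "full",
                                 "target": TARGET,
                                 "format": "qcow2"}}))
s.close()

A real script would then wait for the BLOCK_JOB_COMPLETED event before
considering the backup done.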
----- Original Message -----
From: "Götz Reinicke" <goetz.reinicke@xxxxxxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Wednesday, 6 May 2015 10:25:01
Subject: How to backup hundreds or thousands of TB
Hi folks,
besides hardware, performance, and failover design: how do you manage
to back up hundreds or thousands of TB :) ?
Any suggestions? Best practices?
A second Ceph cluster at a different location? "Bigger archive" disks in
good boxes? Or tape libraries?
What kind of backup software can handle such volumes nicely?
Thanks and regards, Götz
--
Götz Reinicke
IT Coordinator
Tel. +49 7141 969 82 420
E-Mail goetz.reinicke@xxxxxxxxxxxxxxx
Filmakademie Baden-Württemberg GmbH
Akademiehof 10
71638 Ludwigsburg
www.filmakademie.de
Registered at Amtsgericht Stuttgart, HRB 205016
Chairman of the Supervisory Board: Jürgen Walter MdL, State Secretary in
the Ministry of Science, Research and the Arts, Baden-Württemberg
Managing Director: Prof. Thomas Schadt
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
======================
Jean-Philippe Méthot
Administrateur système / System administrator
GloboTech Communications
Phone: 1-514-907-0050
Toll Free: 1-(888)-GTCOMM1
Fax: 1-(514)-907-0750
jpmethot@xxxxxxxxxx
http://www.gtcomm.net
--
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma310@xxxxxxxxxx