Backup & Restore?

The short answer is "no".  The longer answer is "it depends".  The most 
concise discussion I've seen is Inktank's Multi-site option whitepaper: 
http://info.inktank.com/multisite_options_with_inktank_ceph_enterprise

That whitepaper only addresses RBD backups (using snapshots) and 
RadosGW backups (using RadosGW replication).  The first option in the 
whitepaper, a single cluster spanning multiple locations, isn't a backup.
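
As a rough illustration of the snapshot-based RBD approach, here is a 
minimal Python sketch that only builds the command lines for one 
incremental cycle using "rbd snap create", "rbd export-diff" and 
"rbd import-diff".  The helper name, pool, image, snapshot names and 
diff path are hypothetical, and I'm not claiming this is the exact 
procedure the whitepaper describes.

```python
def rbd_incremental_backup_cmds(pool, image, prev_snap, new_snap,
                                diff_path, dest_pool=None):
    """Return argv lists for one snapshot-based incremental RBD backup.

    Hypothetical helper: it only builds the commands, it does not run them.
    prev_snap=None means "first run", which exports everything up to
    new_snap instead of a delta between two snapshots.
    """
    src = f"{pool}/{image}"
    dst = f"{dest_pool or pool}/{image}"

    # 1. Take a new point-in-time snapshot on the source cluster.
    create = ["rbd", "snap", "create", f"{src}@{new_snap}"]

    # 2. Export only the blocks that changed since the previous snapshot.
    export = ["rbd", "export-diff"]
    if prev_snap is not None:
        export += ["--from-snap", prev_snap]
    export += [f"{src}@{new_snap}", diff_path]

    # 3. Apply the diff to the copy of the image on the backup cluster.
    #    Assumption: this is run against the backup cluster's config, and
    #    the destination image already exists there.
    apply_diff = ["rbd", "import-diff", diff_path, dst]

    return [create, export, apply_diff]


if __name__ == "__main__":
    for argv in rbd_incremental_backup_cmds(
            "rbd", "vm1", "monday", "tuesday", "/tmp/vm1.diff"):
        print(" ".join(argv))
```

Each returned list could be handed to subprocess.run() once the names 
are adjusted.  Keeping older snapshots on the backup side is what makes 
single-item restores possible.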

I'm not aware of any backup or offsite capability for raw RADOS pools.


There really aren't any good options for backing up CephFS.  You could 
use rsync on CephFS, but it's not going to work well.  rsync to offsite 
locations begins to have problems around the terabyte range, give or 
take an order of magnitude.  The exact point depends on your bandwidth, 
latency, file count, average file size, average file churn, and disk 
I/O on both sides.  It takes a lot of time and disk I/O to enumerate 
all the files on the filesystem and compare them to the offsite copy.  
CephFS does have some nice features that could make for an efficient 
backup.  It tracks size and change time recursively per directory, so a 
directory's values reflect everything beneath it.  If rsync (or any 
backup client) were aware of that, it could skip unchanged subtrees and 
prune the directory tree enumeration much more efficiently.  That 
should scale well to much larger filesystems, limited mostly by file 
churn and churn locality.  I don't know of anybody who is working on 
that.  I'm interested in the concept, but I have no plans (personal or 
professional) to use CephFS.
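
To make the pruning idea concrete, here is a minimal Python sketch of a 
tree walk that skips any directory whose recursive change time is older 
than the last backup.  It assumes the recursive timestamp is readable 
as the virtual extended attribute "ceph.dir.rctime" on a mounted 
CephFS, and that its value starts with whole seconds followed by a dot.  
The function names are hypothetical, and this is not an existing rsync 
feature.

```python
import os


def read_rctime(path):
    """Return a directory's recursive ctime in whole seconds.

    Assumption: CephFS exposes it as the 'ceph.dir.rctime' virtual xattr
    with a value of the form '<seconds>.<nanoseconds>'.
    """
    raw = os.getxattr(path, "ceph.dir.rctime").decode()
    return int(raw.split(".")[0])


def changed_files(root, since, rctime=read_rctime):
    """Yield files under root whose ctime is newer than `since`.

    Subtrees whose recursive ctime predates `since` are never listed,
    so the cost tracks churn rather than total file count.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        # Compare whole seconds and skip only when strictly older, so a
        # change made in the same second as `since` is not missed.
        if rctime(directory) < int(since):
            continue
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.stat(follow_symlinks=False).st_ctime > since:
                    yield entry.path
```

The rctime argument is injectable so the walk can be tried on any 
filesystem with a stand-in function.  This sketch only finds new and 
modified files; deletions would need separate handling, for example by 
comparing listings of the directories that were not skipped.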


I'm currently working on adding snapshot capabilities to RadosGW.  
Combined with replication, snapshots can protect against disasters, 
PEBKAC, and application errors.  Replication alone protects against 
disasters, but not against PEBKAC or application errors, because a bad 
delete or overwrite is replicated just like any other write.  In the 
same way, RAID protects against disk failure but not file deletion.


Replication + Snapshots (for both RadosGW and RBD) don't protect against 
a determined attacker.  Even tape is vulnerable to a determined attacker 
with a high security level in your organization.  The trick with both 
offline backups and remote snapshots is to set up enough barriers and 
checks that things get noticed before a determined attacker can finish 
the job.  It's easier to do with offline backups than online backups.




*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email clewis at centraldesktop.com

On 4/2/14 00:08 , Robert Sander wrote:
> Hi,
>
> what are the options to consistently back up and restore
> data out of a ceph cluster?
>
> - RBDs can be snapshotted.
> - Data on RBDs used inside VMs can be backed up using tools from the guest.
> - CephFS data can be backed up using rsync or similar tools
>
> What about object data in other pools?
>
> There are two scenarios where a backup is needed:
>
> - disaster recovery, i.e. the whole cluster goes nuts
> - single item restore, because PEBKAC or application error
>
> Is there any work in progress to cover these?
>
> Regards