Re: To backup or not to backup the classic way - How to backup hundreds of TB?


 



Hardware failures are just one possible cause. If you value your data you will have a backup, preferably going to some sort of removable media that can be taken offsite, like those things that everybody keeps saying are dead... what are they called... oh yeah, tapes. :) An online copy of your data on some sort of large JBOD or second Ceph cluster is a good idea if you need faster access, but I wouldn’t rely on it as my only backup.

 

There are many things that can cause data loss; failing hardware is just one. As can be seen from many posts on this list, bugs in Ceph and user error are much more common causes of data loss, and triple replication won’t protect you from them. Thought should also be given to malicious actions by internal staff with grievances or by external hackers (e.g. ransomware). In these cases even online backups made with rsync etc. might not protect you, as that data can be accessed and deleted at the same time as the live data. I predict these sorts of incidents will become more common in the near future.

 

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of ????????????, ????????
Sent: 14 February 2017 09:56
To: Götz Reinicke <goetz.reinicke@xxxxxxxxxxxxxxx>
Cc: ceph new <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: [ceph-users] To backup or not to backup the classic way - How to backup hundreds of TB?

 

Hello!

 

The answer pretty much depends on your fears. If you are afraid of hardware failures, you could keep more than the standard 3 copies, configure your failure domain properly and so on. If you are afraid of some big disaster that could hurt all of your hardware, you could consider making an async replica to a cluster in another datacenter on another continent. If you are afraid of some kind of cluster software issue, then you can build another cluster and use third-party tools to back up data there, but as you correctly noticed it will not be too convenient.
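
For RBD images, one possible way to feed a second cluster is to ship incremental snapshot diffs with `rbd export-diff` and `rbd import-diff`. The following is only a minimal sketch that builds the command lines for one backup step; the pool name `rbd`, the image name `films` and the snapshot names are made up for illustration, and how the import side reaches the second cluster (ssh, a separate config file, ...) is deliberately left out.

```python
import shlex


def incremental_backup_commands(pool, image, prev_snap, new_snap, dest_pool=None):
    """Build the rbd command lines (as argv lists) for one backup step.

    prev_snap=None means a first, full export of the image up to new_snap.
    The names passed in are placeholders; nothing is executed here.
    """
    dest_pool = dest_pool or pool
    src = f"{pool}/{image}"

    # 1. Take a new snapshot on the production cluster.
    create = ["rbd", "snap", "create", f"{src}@{new_snap}"]

    # 2. Export only the changes since the previous snapshot to stdout.
    export = ["rbd", "export-diff"]
    if prev_snap is not None:
        export += ["--from-snap", prev_snap]
    export += [f"{src}@{new_snap}", "-"]

    # 3. Apply the diff from stdin to the copy on the backup cluster.
    import_ = ["rbd", "import-diff", "-", f"{dest_pool}/{image}"]

    return create, export, import_


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in incremental_backup_commands("rbd", "films", "backup-1", "backup-2"):
        print(shlex.join(cmd))
```

The export and import commands are meant to be connected by a pipe, with the import running against the backup cluster. Note that the destination image has to exist there before the first diff is imported.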

 

As a common solution I would suggest using the same cluster for backups as well (maybe just a different pool/OSD tree with less expensive drives); in most cases that's enough.


Best regards,

Vladimir

 

2017-02-14 14:15 GMT+05:00 Götz Reinicke <goetz.reinicke@xxxxxxxxxxxxxxx>:

Hi,

I guess that's a question that pops up in different places, but I could not find any answer that fits my thoughts.

Currently we are starting to use Ceph for file shares of the films produced by our students and for some Xen/VMware VMs. The VM data is already backed up; the films' original footage is stored in other places.

We are starting with some 100 TB of RBD and mount SMB/NFS shares from the clients. Maybe we will look into CephFS soon.

The question is: how would someone handle a backup of 100 TB of data? Rsyncing that to another system or buying a commercial backup solution does not look that good, e.g. regarding the price.
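
To give a feeling for the scale, here is a rough back-of-the-envelope estimate of how long one full copy of 100 TB would take over a network link. It is only a sketch: the link speeds are example values, 1 TB is taken as 10^12 bytes, and the `efficiency` parameter is an assumed fudge factor for protocol and disk overhead, not a measured number.

```python
def full_copy_hours(size_tb, link_gbit_s, efficiency=1.0):
    """Hours needed to move size_tb terabytes over a link of link_gbit_s Gbit/s.

    efficiency is the assumed fraction of the nominal link speed
    that is actually usable for payload data (1.0 = ideal).
    """
    size_bits = size_tb * 1e12 * 8
    usable_bits_per_s = link_gbit_s * 1e9 * efficiency
    return size_bits / usable_bits_per_s / 3600


if __name__ == "__main__":
    for link in (1, 10):
        print(f"{link} Gbit/s: {full_copy_hours(100, link):.1f} h")
    # 1 Gbit/s: 222.2 h
    # 10 Gbit/s: 22.2 h
```

Even under ideal conditions a full copy takes about a day at 10 Gbit/s and more than nine days at 1 Gbit/s, so repeated full copies do not scale well and some incremental scheme is needed.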

One thought: is there some sort of best practice in the Ceph world, e.g. replicating to another physically independent cluster? Or using more replicas, OSDs and nodes, and doing snapshots in one cluster?

Having production data and backup on the same hardware currently does not make me feel that good either... But the world changes :)

Long story short: how do you back up hundreds of TB?

        Curious for suggestions and thoughts .. Thanks and Regards . Götz


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 


