Re: CEPH backup strategy and best practices

On 4 June 2017 at 23:23, Roger Brown <rogerpbrown@xxxxxxxxx> wrote:

I'm a n00b myself, but I'll go on record with my understanding.

On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa <benoit.georgelin@xxxxxxxx> wrote:
Hi ceph users, 

Ceph has very good documentation on technical usage, but there is a lot of conceptual material missing (from my point of view).
It's not easy to understand it all at once, but yes, little by little it's working.

Here are some questions about Ceph; I hope someone can take a little time to point me to where I can find answers:

 - Backup:
Do you back up data from a Ceph cluster, or do you consider a replica as a backup of that data?
Let's say I have a replica size of 3. My CRUSH map will keep 2 copies in my main rack and 1 copy in another rack in another datacenter.
Can I consider the third copy as a backup? What would be your position?

Replicas are not backups. Just ask GitLab after their accidental deletion. Source: https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/
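For reference, the 2-copies-local / 1-copy-remote placement described above is usually expressed as a CRUSH rule with two take/emit steps. A rough sketch only (the rack bucket names main-rack and remote-rack are made up and would need to match your actual CRUSH tree):

    rule main_plus_remote {
            ruleset 1
            type replicated
            min_size 3
            max_size 3
            # 2 copies on different hosts in the local rack
            step take main-rack
            step chooseleaf firstn 2 type host
            step emit
            # 1 copy on a host in the remote rack
            step take remote-rack
            step chooseleaf firstn 1 type host
            step emit
    }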


- Writing process of Ceph object storage using radosgw
Simple question, but I'm not sure about it.
Will my cluster get slower the more replicas I have? Does Ceph have to acknowledge all the replicas before saying a write is good?
From what I read, Ceph writes and acknowledges on the primary OSD of the pool. If that's the case, it would not matter how many replicas I want or how far away the other OSDs are located; it would work the same.
Can I choose the primary OSD myself in zone 1, have a copy in zone 2 (same rack), and a third copy in zone 3 in another datacenter that might have some latency?

More replicas make for a slower cluster, because the primary waits for all the replica OSDs to acknowledge the write before reporting back to the client. source: ?
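As for choosing the primary OSD yourself: you cannot pin a primary per object, but you can check the replica count per pool and, if your release supports it and the option is enabled, bias which OSDs become primaries with primary affinity. A rough sketch with made-up names (pool "mypool", osd.7):

    # replica count for a pool
    ceph osd pool get mypool size
    ceph osd pool set mypool size 3

    # make osd.7 half as likely to be chosen as primary (range 0.0 - 1.0)
    ceph osd primary-affinity osd.7 0.5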

I'd say stick with 3 replicas in one DC, then if you want to add another DC for better data protection (note, not backup), you'll just add asynchronous mirroring between DCs (http://docs.ceph.com/docs/master/rbd/rbd-mirroring/) with another cluster there.
That way you'll have a quick cluster (especially if you use fast disks such as NVMe SSD journals + SSD storage or better) with location redundancy.
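Very roughly, the rbd-mirror setup from that doc looks like this (a sketch only; the pool name "rbd", the cluster names "local"/"remote" and the peer users are placeholders, see the link above for the real procedure):

    # on each cluster: enable mirroring on the pool (images need the journaling feature)
    rbd mirror pool enable rbd pool --cluster local
    rbd mirror pool enable rbd pool --cluster remote

    # register each cluster as a peer of the other
    rbd mirror pool peer add rbd client.local@local --cluster remote
    rbd mirror pool peer add rbd client.remote@remote --cluster local

    # then run the rbd-mirror daemon on the backup site to replay the journals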



- Data persistence / availability
If the CRUSH map distributes by host and I have 3 hosts with a replication of 3,
this means I will have 1 copy on each host.
Does that mean I can lose 2 hosts and still have my cluster working, at least in read mode? And eventually in write mode too if I set osd pool default min size = 1?

Yes, I think so. But best practice is to have at least 5 hosts (N+2) so you can lose 2 hosts and still keep 3 replicas.
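For reference, these are the knobs being discussed: the defaults go in ceph.conf, and min_size can also be changed per existing pool (the pool name "mypool" is a placeholder):

    # ceph.conf defaults applied to newly created pools
    [global]
    osd pool default size = 3
    osd pool default min size = 2

    # per-pool override, e.g. to keep accepting I/O with a single copy left
    ceph osd pool set mypool min_size 1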
 

Keep in mind that you ”should” have enough free storage as well to be able to lose 2 nodes. If you fill 5 nodes to 80% and lose 2 nodes, you won't be able to repair it all until you get them up and running again.
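A quick worked example of that point, with made-up numbers: 5 nodes of 10 TB each is 50 TB raw; at 80% full you are using 40 TB. Lose 2 nodes and only 30 TB of raw capacity remains, which cannot hold the 40 TB needed to restore full replication, so the cluster stays degraded until the failed nodes return. To survive 2 node failures and fully re-replicate, the remaining 3 nodes must have enough free space for all of the data.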


Thanks for your help. 

Benoît G

Roger
 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
