Re: S3 and RBD backup

Sanjeev Jha <sanjeev_mac@xxxxxxxxxxx> · Wed, 18 May 2022 20:33:45 +0000

Thanks, Janne, for the detailed information.

We have an RHCS 4.2 non-collocated setup in a single DC. A few RBD volumes are mapped to a MariaDB database.
An S3 endpoint with a bucket is also used for uploading objects. No multisite zone has been implemented yet.
My requirement is to back up the RBD images and the database.
How can S3 bucket backup and restore be done?
We are looking at several open-source tools, such as rclone for S3 and Benji for RBD, but we are not sure whether these tools would be enough to achieve our backup goal.
Any suggestions based on the above case would be much appreciated.

Best,
Sanjeev

________________________________
From: Janne Johansson <icepic.dz@xxxxxxxxx>
Sent: Tuesday, May 17, 2022 1:01 PM
To: Sanjeev Jha <sanjeev_mac@xxxxxxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re:  S3 and RBD backup

On Mon, 16 May 2022 at 13:41, Sanjeev Jha <sanjeev_mac@xxxxxxxxxxx> wrote:
> Could someone please let me know how to take S3 and RBD backups from the Ceph side, and whether it is possible to take backups from the client/user side?
> Which tool should I use for the backup?

Backing data up, or replicating it, is a choice among many
variables and options, and a matter of choosing something that has the
least negative effects for your own environment and your own demands. Some
options will cause a lot of network traffic, others will use a lot of
CPU somewhere, others will waste disk on the destination for
performance reasons and some will have long and complicated restore
procedures. Some will be realtime copies but those might put extra
load on the cluster while running, others will be asynchronous but
might need a database at all times to keep track of what not to copy
because it is already at the destination. Some synchronous options
might even cause writes to be slower in order to guarantee that ALL
copies are in place before sending clients an ACK, some will not and
those might lose data that the client thought was delivered 100% ok.

Without knowing what your demands are, or knowing what situation and
environment you are in, it will be almost impossible to match the
above into something that is good for you.
Some might have a monetary cost, some may require a complete second
cluster of equal size, some might have a cost in terms of setup work
from clueful ceph admins that will take a certain amount of time and
effort. Some options might require clients to change how they write
data into the cluster in order to help the backup/replication system.

There is unfortunately no single best choice for all clusters, and
there might not even be one good option that covers both S3 and RBD,
since they are inherently very different.
RBD restores will almost certainly be full restores of a large,
complete image, whereas S3 users might want only the object
foo/bar/MyImportantWriting.doc from last Wednesday back, without
reverting the whole bucket or the whole S3 setup.

I'm quite certain that there will not be a single cheap, fast,
efficient, scalable, unnoticeable, easy solution that solves
all these problems at once, but rather you will have to focus on what
the toughest limitations are (money, time, disk, rackspace, network
capacity, client and IO demands?) and look for solutions (or products)
that work well with those restrictions.

--
May the most significant bit of your life be positive.