Hi Janne,

First of all, thank you so much for your explanation.

My intention was first to find out which backup solutions and methods are available for Ceph, and only then to think about the backup policy. I think you are referring to the backup policy needed by the customer. Below you can see the existing backup schedule and retention policies. Ceph has been implemented recently and we are looking for a backup option that matches the current setup for the other infrastructure. This is a customer requirement and needs to be implemented exactly the same way for Ceph as well.

For the existing infrastructure, we currently have three different schedules for backing up the different systems: Daily, Monthly and Archive.

The Daily schedule consists of a backup every day between Monday and Friday. NetBackup creates a full backup on one of those days (chosen by NetBackup to split the load among the days), and on the rest of the weekdays a cumulative incremental backup is created. This means any restore only needs the last full backup and the last cumulative incremental dataset, which allows for faster restores. Data is kept for 4 weeks.

The Monthly schedule consists of a full backup created on either a Saturday or a Sunday. Data is kept for 6 months.

The Archive schedule relies on user requests for a complete system or a specific data set to be backed up. Data is kept for 10 years.

All the schedules use the same approach for backing up data.

The LTA archive is fully backed up once a week, and point-in-time recoveries are available by using a binary log retention of 31 days. Binary logs are fully backed up to tape on a weekly basis, with a cumulative incremental executed daily. Retention time is 8 weeks for all the MySQL-based datasets.

I hope this is the homework you asked me to do, i.e. gathering the details mentioned above. The new task is now to set up backups for RBD and S3 based on the above requirements, and I am looking for the best options to implement it.
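
To make the requirement concrete, the retention periods and the "last full plus last cumulative incremental" restore rule described above can be sketched roughly as follows. This is only an illustration of the policy, not how NetBackup implements it: the names `Schedule`, `expires_on` and `restore_chain` are hypothetical, and months and years are approximated as 30 and 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Schedule:
    name: str
    retention_days: int


# Retention periods taken from the schedules described above.
# Assumption of this sketch: a month is 30 days and a year is 365 days.
SCHEDULES = {
    "daily": Schedule("daily", 4 * 7),
    "monthly": Schedule("monthly", 6 * 30),
    "archive": Schedule("archive", 10 * 365),
}


def expires_on(schedule: str, taken: date) -> date:
    """Return the date on which a backup taken on `taken` may be purged."""
    return taken + timedelta(days=SCHEDULES[schedule].retention_days)


def restore_chain(backups: list[tuple[date, str]], target: date) -> list[tuple[date, str]]:
    """Pick the data sets needed to restore to `target`.

    `backups` holds (date, kind) pairs where kind is "full" or "cumulative".
    A cumulative incremental contains every change since the last full, so
    at most two data sets are needed: the last full and the last cumulative.
    """
    usable = sorted(b for b in backups if b[0] <= target)
    fulls = [b for b in usable if b[1] == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the target date")
    last_full = fulls[-1]
    later = [b for b in usable if b[1] == "cumulative" and b[0] > last_full[0]]
    # later[-1:] is empty when the full itself is the newest usable backup.
    return [last_full] + later[-1:]
```

For example, with a full backup on a Tuesday and cumulative incrementals on the other weekdays, a restore to Friday would only read the Tuesday full and the Friday cumulative.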
Please let me know if I have missed anything.

Best,
Sanjeev

________________________________
From: Janne Johansson <icepic.dz@xxxxxxxxx>
Sent: Thursday, May 19, 2022 1:15 PM
To: Sanjeev Jha <sanjeev_mac@xxxxxxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re: S3 and RBD backup

On Wed, 18 May 2022 at 22:33, Sanjeev Jha <sanjeev_mac@xxxxxxxxxxx> wrote:
>
> Thanks Janne for the information in detail.
>
> We have an RHCS 4.2 non-collocated setup in one DC only. There are a few RBD volumes mapped to a MariaDB database.
> Also, an S3 endpoint with a bucket is being used to upload objects. No multisite zone has been implemented yet.
> My requirement is to take backups of the RBD images and the database.
> How can S3 bucket backup and restore be done?
> We are looking at several open-source tools, like rclone for S3 and Benji for RBD, but we cannot be sure whether these tools would be enough to achieve our backup goal.
> Your suggestion based on the above case would be much appreciated.

Unfortunately, your reply mostly repeated that you want to back up RBD and S3, plus the OS version in use and one of the programs the clients run, but said almost nothing on the various dimensions I listed (and there are probably more than those I managed to think of when I wrote it) which would affect the actual choices you have.

When my customers state "I want backup", we need to ask if they mean "crash recovery", "having one copy of the last 30, 90, 365 days" or "archive a single copy of some data for 1-5-10-15 years for legal reasons". Those three scenarios are all VERY different, and handled in very different ways. You can't just take a 10-year-archive solution and hope that it will work fine for "my live payment database just crashed, can we be up again in 5 minutes?" cases.

So if you don't know now what you want or need, chances are very low that any suggestion you receive will be correct.
You can spend months working on the wrong solution, trying to adapt it to the actual problems you realize you have, when doing some design work beforehand would save you lots of time.

If you look at the other answers on this thread, you see choices made like "we do x,y,z to move certain data over to our existing backup program because we can't afford a full replica site". There, someone knew a replication site would be a decent option, but cost or space or some other constraint made that option less attractive than rigging an export translation in order to utilize the not-optimal but existing backup framework already in use and invested in.

That is what I tried to describe in my first reply. You must figure out what is and isn't important, and what is and isn't possible in your environment in terms of CPU, network, disk and license costs; whether clients must stop accessing the data while backups run or not; and whether best-effort copies are good enough or your clients will be angry at you at restore time for backing up an SQL db while writes were in flight, so it isn't easily restartable or consistent.

Backing up moving data is hard, and it has become harder over the last 50+ years because those who have worked with storage and backups have seen all the ways restores fail.

At first people would think backups are just copying a file from A to B, often manually. Then users "forgot" to do that, so some scripts had to run at intervals so the computer would remember to do it every X hours. Then any file that was open would not get a decent backup for reasons. Then servers needed backup, and they started running databases that would never be still (if the company gets to choose) and those were even more important to back up. Then you would get a night-time window to make backups, perhaps between 02 and 03. Then data grew like it always does, so 1h was not enough.
So you need to either not back up, or stop the backup at 03, or try to figure out a more efficient way to read, transfer and store data so backups go fast again even at larger size. This eats resources at both ends, and hence backup servers need more capacity to handle many CPU-intensive backups at 02-03.

You start doing incremental backups. That is super nice, except now restores take lots more time. Those almost-an-hour-long backup jobs that only send new data over to the nightly backup still need to be rebuilt from all the differentials back to the last full backup. Restores will then not take "an hour" just because the nightly used to take an hour.

Suddenly you need faster net, faster tapes/drives, but the boss wants to spend company money on sailboat racing adverts instead of expensive backup hardware that isn't making a profit for the company, so whatever solution you want needs to cost nothing but still have capacity for 90 daily backups and full data copies that restore in an instant.

All of those can't be fulfilled at the same time, so you have to do your homework and figure out which demands are super important and which parts are nice to have.

So while I appreciate that you ask for help with solutions, there is still that part where you have to do a bit of the homework and figure out your limitations, and not just repeat "I want to backup S3 and RBD".

> > Could someone please let me know how to take S3 and RBD backup from Ceph side and possibility to take backup from Client/user side?
> > Which tool should I use for the backup?
>
> Backing data up, or replicating it is a choice between a lot of
> variables and options, and choosing something that has the least
> negative effects for your own environment and your own demands.
> Some options will cause a lot of network traffic, others will use a lot of
> CPU somewhere, others will waste disk on the destination for
> performance reasons and some will have long and complicated restore
> procedures. Some will be realtime copies but those might put extra
> load on the cluster while running, others will be asynchronous but
> might need a database at all times to keep track of what not to copy
> because it is already at the destination. Some synchronous options
> might even cause writes to be slower in order to guarantee that ALL
> copies are in place before sending clients an ACK, some will not and
> those might lose data that the client thought was delivered 100% ok.
>
> Without knowing what your demands are, or knowing what situation and
> environment you are in, it will be almost impossible to match the
> above into something that is good for you.
> Some might have a monetary cost, some may require a complete second
> cluster of equal size, some might have a cost in terms of setup work
> from clueful ceph admins that will take a certain amount of time and
> effort. Some options might require clients to change how they write
> data into the cluster in order to help the backup/replication system.
>
> There is unfortunately not a single best choice for all clusters,
> there might even not exist a good option just to cover both S3 and RBD
> since they are inherently very different.
> RBD will almost certainly be only full restores of a large complete
> image, S3 users might want to have the object
> foo/bar/MyImportantWriting.doc from last wednesday back only and not
> revert the whole bucket or the whole S3 setup.
>
> I'm quite certain that there will not be a single
> cheap,fast,efficient,scalable,unnoticeable,easy solution that solves
> all these problems at once, but rather you will have to focus on what
> the toughest limitations are (money, time, disk, rackspace, network
> capacity, client and IO demands?)
> and look for solutions (or products)
> that work well with those restrictions.
>
> --
> May the most significant bit of your life be positive.

--
May the most significant bit of your life be positive.