Re: backing up CephFS

The other question I would ask myself in this situation is what needs to be
backed up, and how.  Borg is excellent, but it is built to shine at binary
deduplication, which requires computational work.  If the data stream is a
lot of small files, it may be easier to rsync them somewhere else
periodically, maybe with the backup (-b) option.  If the data is a database
with lots of opaque files, binary dedup is a good way to go (taking into
account the quiescing process to keep everything consistent and
recoverable).  If you do go the Proxmox route, there is an option to do KVM
dirty bitmapping (loosely similar to VMware Changed Block Tracking) with
Proxmox Backup Server.
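
For the small-file case, a rough sketch of the rsync idea (all paths and
the retention scheme here are just placeholders, adjust to taste):

  # pull the tree into a plain copy; anything changed or deleted since the
  # last run gets moved into a dated directory instead of being lost
  rsync -a --delete \
      --backup --backup-dir=/backup/cephfs-changed/$(date +%F) \
      /mnt/cephfs/ /backup/cephfs-current/

That gives you a browsable current copy plus a crude per-day history of
whatever changed, without any dedup overhead.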

Filesystem snapshots IMO are *not* a good backup method; they are useful
for point-in-time consistent data, or as a back-out point for e.g.
revertible changes.
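
If you do use snapshots for that back-out purpose, CephFS makes them cheap:
a snapshot is just a mkdir in the special .snap directory, and the
snap_schedule mgr module can automate them.  A minimal sketch (mount point,
path and names are placeholders, and it assumes a single filesystem):

  # ad-hoc snapshot as a back-out point before a risky change
  mkdir /mnt/cephfs/research/.snap/pre-change-2023-04-30

  # or have the mgr take hourly snapshots of that subtree and keep 24
  ceph mgr module enable snap_schedule
  ceph fs snap-schedule add /research 1h
  ceph fs snap-schedule retention add /research h 24

Keep in mind, though, that as far as I know any client holding the 's' cap
on that path can rmdir those snapshots just as easily, which is exactly why
I would not count them as ransomware protection.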

--
Alex Gorbachev
www.iss-integration.com
ISS Storcium



On Sun, Apr 30, 2023 at 12:00 PM Milind Changire <mchangir@xxxxxxxxxx>
wrote:

> On Sun, Apr 30, 2023 at 9:02 PM William Edwards <wedwards@xxxxxxxxxxxxxx>
> wrote:
>
> > Angelo Höngens wrote on 2023-04-30 15:03:
> > > How do you guys backup CephFS? (if at all?)
> > >
> > > I'm building 2 ceph clusters, a primary one and a backup one, and I'm
> > > looking into CephFS as the primary store for research files. CephFS
> > > mirroring seems a very fast and efficient way to copy data to the
> > > backup location, and it has the benefit of the files on the backup
> > > location being fully in a ready-to-use state instead of some binary
> > > proprietary archive.
> > >
> > > But I am wondering how to do 'ransomware protection' in this setup. I
> > > can't believe I'm the only one that wants to secure my data ;)
> > >
> > > I'm reading up on snapshots and mirroring, and that's great to protect
> > > from user error. I could schedule snapshots on the primary cluster,
> > > and they would automatically get synced to the backup cluster.
> > >
> > > But a user can still delete all snapshots on the source side, right?
> > >
> > > And you need to create a ceph user on the backup cluster, and import
> > > that on the primary cluster. That means that if a hacker has those
> > > credentials, he could also delete the data on the backup cluster? Or
> > > is there some 'append-only' mode for immutability?
> > >
> > > Another option I'm looking into is restic. Restic looks like a cool
> > > tool, but it does not support S3 object locks yet. See the discussion
> > > here [1]. I should be able to get immutability working with the
> > > restic-rest backend according to the developer. But I have my worries
> > > that running restic to sync up an 800TB filesystem with millions of
> > > files will be... worrisome ;) Anyone using restic in production?
> > >
> > > Thanks again for your input!
> >
> > Among other approaches, we mount CephFS's root directory on a machine and
> > back up that mount using Borg. In our experience, Borg is faster than Restic. I
> > actually open-sourced the library we wrote for Borg yesterday, see:
> > https://github.com/CyberfusionNL/python3-cyberfusion-borg-support
> >
> > >
> > > Angelo.
> > >
> > >
> > >
> > > [1] https://github.com/restic/restic/issues/3195
> >
> > --
> > With kind regards,
> >
> > William Edwards
> >
>
> I'm not too aware of Borg's features, but ...
> To add to this suggestion: to achieve a consistent backup, you'll want to:
> 1. quiesce your applications and sync the filesystem
> 2. take a CephFS filesystem snapshot
> 3. possibly replicate the snapshot somewhere else with CephFS Mirror, and
> 4. either of
>    4.1 back up the remote filesystem where the snapshot was replicated, or
>    4.2 back up the snapshot on the active filesystem
>
>
> --
> Milind
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



