On Sun, May 17, 2015 at 5:08 PM, Francois Lafont <flafdivers at free.fr> wrote:
> Hi,
>
> Wido den Hollander wrote:
>
>> Aren't snapshots something that should protect you against removal? If
>> snapshots work properly in CephFS you could create a snapshot every hour.
>
> Are you talking about the .snap/ directory in a cephfs directory?
> If yes, does it work well? Because, with Hammer, if I want to enable
> this feature:
>
> ~# ceph mds set allow_new_snaps true
> Error EPERM: Snapshots are unstable and will probably break your FS!
> Set to --yes-i-really-mean-it if you are sure you want to enable them
>
> I have never tried with the --yes-i-really-mean-it option. The warning
> is not very encouraging. ;)

Heh. Snapshots are probably more stable in Hammer than they've ever been
before -- but they are still more likely to break your filesystem than
anything else you can do (except possibly enabling multiple active MDSes?).
We added these warnings, plus some bits in the map recording whether
snapshots have ever been enabled, to make it clear to users that there's a
risk, and so that we know which world we're in when looking at bugs. :)

>
>> With the recursive statistics [0] of CephFS you could "easily" backup
>> all your data to a different Ceph system or anything not Ceph.
>
> What is the link between this (very interesting) recursive statistics
> feature and the backup? I'm not sure I understand. Can you explain it
> to me? Maybe you check whether the size of a directory has changed?
>
>> I've done this with a ~700TB CephFS cluster and that is still working
>> properly.
>>
>> Wido
>>
>> [0]:
>> http://blog.widodh.nl/2015/04/playing-with-cephfs-recursive-statistics/
>
> Thanks Wido for this very interesting (and very simple) feature.
> But does it work well? Because I use Hammer on Ubuntu Trusty
> cluster nodes, and on an Ubuntu Trusty client with a 3.16 kernel
> and cephfs mounted with the kernel client, I have this:
>
> ~# mount | grep cephfs   # /mnt is my mounted cephfs
> 10.0.2.150,10.0.2.151,10.0.2.152:/ on /mnt type ceph (noacl,name=cephfs,key=client.cephfs)
>
> ~# ls -lah /mnt/dir1/
> total 0
> drwxr-xr-x 1 root root  96M May 12 21:06 .
> drwxr-xr-x 1 root root 103M May 17 23:56 ..
> drwxr-xr-x 1 root root  96M May 12 21:06 8
> drwxr-xr-x 1 root root 4.0M May 17 23:57 test
>
> As you can see:
> /mnt/dir1/8/    => 96M
> /mnt/dir1/test/ => 4.0M
>
> But:
> /mnt/dir1/ (ie .) => 96M
>
> I should have:
>
> size("/mnt/dir1/") = size("/mnt/dir1/8/") + size("/mnt/dir1/test/")
>
> and this is not the case. Is it normal?

That's not something I've seen before, but I have a suspicion that this is
because we needed to change one of the block sizes to 4MB in our kernel
interfaces. It's a bit distressing, though, if this is common everywhere...
-Greg
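
P.S. On the mechanics of the .snap/ directory, in case it helps: once
allow_new_snaps has been enabled, taking a snapshot is just a mkdir inside
the hidden .snap directory of whatever directory you want to capture, and
an rmdir drops it again. Roughly, assuming the same /mnt mount as above and
a made-up snapshot name:

~# mkdir /mnt/dir1/.snap/monday    # take a snapshot of dir1
~# ls /mnt/dir1/.snap/monday/      # read-only view of dir1 at snapshot time
~# rmdir /mnt/dir1/.snap/monday    # remove the snapshot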
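
And on the recursive statistics Wido linked to: they are exposed as virtual
extended attributes on every directory, so you can read them directly
rather than relying on what ls displays. Something like this should work on
the kernel-client mount above (assuming getfattr from the attr package is
installed; /mnt/dir1 is just the example directory from the listing):

~# getfattr -n ceph.dir.rbytes /mnt/dir1     # recursive byte count
~# getfattr -n ceph.dir.rfiles /mnt/dir1     # recursive file count
~# getfattr -n ceph.dir.rsubdirs /mnt/dir1   # recursive subdirectory count
~# getfattr -n ceph.dir.rctime /mnt/dir1     # most recent ctime anywhere in the subtree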