On Sun, May 17, 2015 at 5:08 PM, Francois Lafont <flafdivers at free.fr> wrote:
> Hi,
>
> Wido den Hollander wrote:
>
>> Aren't snapshots something that should protect you against removal? If
>> snapshots work properly in CephFS you could create a snapshot every hour.
>
> Are you talking about the .snap/ directory in a cephfs directory?
> If yes, does it work well? Because, with Hammer, if I want to enable
> this feature:
>
> ~# ceph mds set allow_new_snaps true
> Error EPERM: Snapshots are unstable and will probably break your FS!
> Set to --yes-i-really-mean-it if you are sure you want to enable them
>
> I have never tried with the --yes-i-really-mean-it option. The warning
> is not very encouraging. ;)

Heh. Snapshots are probably more stable in Hammer than they've ever been
before -- but they are still more likely to break your filesystem than
anything else you can do (except possibly enabling multiple active MDSes?).
We added these warnings, plus some bits in the map recording whether
snapshots have ever been enabled, to make it clear to users that there's a
risk, and so that we know which world we're in when looking at bugs. :)

>
>> With the recursive statistics [0] of CephFS you could "easily" backup
>> all your data to a different Ceph system or anything not Ceph.
>
> What is the link between this (very interesting) recursive statistics
> feature and the backup? I'm not sure I understand. Can you explain it
> to me? Maybe you check whether the size of a directory has changed?
>
>> I've done this with a ~700TB CephFS cluster and that is still working
>> properly.
>>
>> Wido
>>
>> [0]:
>> http://blog.widodh.nl/2015/04/playing-with-cephfs-recursive-statistics/
>
> Thanks Wido for this very interesting (and very simple) feature.
> But does it work well? Because I use Hammer on Ubuntu Trusty
> cluster nodes, and on an Ubuntu Trusty client with a 3.16 kernel
> and cephfs mounted with the kernel client, I have this:
>
> ~# mount | grep cephfs   # /mnt is my mounted cephfs
> 10.0.2.150,10.0.2.151,10.0.2.152:/ on /mnt type ceph (noacl,name=cephfs,key=client.cephfs)
>
> ~# ls -lah /mnt/dir1/
> total 0
> drwxr-xr-x 1 root root  96M May 12 21:06 .
> drwxr-xr-x 1 root root 103M May 17 23:56 ..
> drwxr-xr-x 1 root root  96M May 12 21:06 8
> drwxr-xr-x 1 root root 4.0M May 17 23:57 test
>
> As you can see:
> /mnt/dir1/8/    => 96M
> /mnt/dir1/test/ => 4.0M
>
> But:
> /mnt/dir1/ (ie .) => 96M
>
> I should have:
>
> size("/mnt/dir1/") = size("/mnt/dir1/8/") + size("/mnt/dir1/test/")
>
> and this is not the case. Is it normal?

That's not something I've seen before, but I have a suspicion that this is
because we needed to change one of the block sizes to 4MB in our kernel
interfaces. It's a bit distressing, though, if this is common everywhere...
-Greg
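
P.S. On the mechanics of the .snap/ directory, in case it helps: once
allow_new_snaps has been enabled, taking a snapshot is just a mkdir inside
the hidden .snap directory of whatever directory you want to capture, and
an rmdir drops it again. Roughly, assuming the same /mnt mount as above and
a made-up snapshot name:

~# mkdir /mnt/dir1/.snap/monday    # take a snapshot of dir1
~# ls /mnt/dir1/.snap/monday/      # read-only view of dir1 at snapshot time
~# rmdir /mnt/dir1/.snap/monday    # remove the snapshot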
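
And on the recursive statistics Wido linked to: they are exposed as virtual
extended attributes on every directory, so you can read them directly
rather than relying on what ls displays. Something like this should work on
the kernel-client mount above (assuming getfattr from the attr package is
installed; /mnt/dir1 is just the example directory from the listing):

~# getfattr -n ceph.dir.rbytes /mnt/dir1     # recursive byte count
~# getfattr -n ceph.dir.rfiles /mnt/dir1     # recursive file count
~# getfattr -n ceph.dir.rsubdirs /mnt/dir1   # recursive subdirectory count
~# getfattr -n ceph.dir.rctime /mnt/dir1     # most recent ctime anywhere in the subtree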