On Tue, Dec 03, 2019 at 02:09:30PM -0500, Jeff Layton wrote:
> On Tue, 2019-12-03 at 07:59 -0800, Robert LeBlanc wrote:
> > On Thu, Nov 14, 2019 at 11:48 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > > On Thu, 14 Nov 2019, Patrick Donnelly wrote:
> > > > On Wed, Nov 13, 2019 at 6:36 PM Jerry Lee <leisurelysw24@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, 14 Nov 2019 at 07:07, Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Nov 13, 2019 at 2:30 AM Jerry Lee <leisurelysw24@xxxxxxxxx> wrote:
> > > > > > > Recently, I'm evaluating the snapshot feature of CephFS from the
> > > > > > > kernel client and everything works like a charm. But it seems that
> > > > > > > reverting a snapshot is not available currently. Is there some
> > > > > > > reason or technical limitation why the feature is not provided?
> > > > > > > Any insights or ideas are appreciated.
> > > > > >
> > > > > > Please provide more information about what you tried to do (commands
> > > > > > run) and how it surprised you.
> > > > >
> > > > > The thing I would like to do is to roll back a snapped directory to a
> > > > > previous snapshot. It looks like the operation can be done by
> > > > > overwriting all the current versions of files/directories from a
> > > > > previous snapshot via cp, but cp may take a lot of time when there
> > > > > are many files and directories in the target directory. Is there any
> > > > > way to achieve the goal much faster from CephFS internals, via a
> > > > > command like "ceph fs <cephfs_name> <dir> snap rollback <snapname>"
> > > > > (just an example)? Thank you!
> > > >
> > > > RADOS doesn't support rollback of snapshots, so it needs to be done
> > > > manually. The best tool for this would probably be rsync of the .snap
> > > > directory with appropriate options, including deletion of files that
> > > > do not exist in the source (snapshot).
> > >
> > > rsync is the best bet now, yeah.
> > >
> > > RADOS does have a rollback operation that uses clone where it can, but
> > > it's a per-object operation, so something still needs to walk the
> > > hierarchy and roll back each file's content. The MDS could do this more
> > > efficiently than rsync, given what it knows about the snapped inodes
> > > (skipping untouched inodes or, eventually, entire subtrees), but it's a
> > > non-trivial amount of work to implement.
> >
> > Would it make sense to extend CephFS to leverage reflinks for cases like
> > this? That could be faster than rsync and more space efficient. It would
> > require some development time though.
> >
>
> I think reflink would be hard. Ceph hardcodes the inode number into the
> object name of the backing objects, so sharing between different inode
> numbers is really difficult to do. It could be done, but it means a new
> in-object-store layout scheme.
>
> That said... I wonder if we could get better performance by just
> converting rsync to use copy_file_range in this situation. That has the
> potential to offload a lot of the actual copying work to the OSDs.

Just to add my 2 cents, I haven't done any serious performance
measurements with copy_file_range. However, the very limited
observations I've made surprised me a bit, showing that performance
isn't great. In fact, when the file object size is small, using
copy_file_range seems to be slower than a full read+write cycle. It's
still on my TODO list to do some more serious performance analysis and
figure out why.
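For anyone who wants to try this kind of comparison, the rsync approach
mentioned above would be something along the lines of
"rsync -a --delete /mnt/cephfs/dir/.snap/<snapname>/ /mnt/cephfs/dir/",
and the question is whether teaching rsync to use copy_file_range() for
the data copies would help. The sketch below shows the copy path in
question; it is only an illustration and not the test code referenced
here (the paths under /mnt/cephfs are made up and error handling is
minimal). The point is that copy_file_range() lets the CephFS kernel
client offload full-object copies to the OSDs, while the plain
read+write loop moves every byte through the client.

/* Minimal sketch of the copy_file_range path discussed above.  Not the
 * benchmark code from this thread; file names are placeholders. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	int in = open("/mnt/cephfs/src", O_RDONLY);
	int out = open("/mnt/cephfs/dst", O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (in < 0 || out < 0) {
		perror("open");
		return 1;
	}

	off_t len = lseek(in, 0, SEEK_END);
	lseek(in, 0, SEEK_SET);

	/* copy_file_range(2): the CephFS kernel client can turn full-object
	 * ranges into OSD-side "copy-from" operations, so the data doesn't
	 * have to travel through the client at all. */
	while (len > 0) {
		ssize_t n = copy_file_range(in, NULL, out, NULL, len, 0);

		if (n <= 0) {
			perror("copy_file_range");
			return 1;
		}
		len -= n;
	}

	/* The alternative being compared against is the usual
	 * read(in, buf, sz) / write(out, buf, n) loop, which pulls every
	 * byte into the client and writes it back out to the OSDs. */

	close(in);
	close(out);
	return 0;
}

Note that the copy_file_range() wrapper needs a reasonably recent
userspace (glibc 2.27+) on top of a kernel that supports the syscall.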
The slowness didn't seem to be on the client side, but I don't really
have any hard evidence. Once the COPY_FROM2 operation is stable, I plan
to spend some time on this.

Cheers,
--
Luís
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com