On Tue, Jan 29, 2019 at 12:56:17AM +0200, Amir Goldstein wrote:
> > > > What I just described above is actually already implemented with
> > > > Overlayfs snapshots [1], but for many applications overlayfs snapshots
> > > > it is not a practical solution.
> > > >
> > > > I have based my assumption that reflink of a large file may incur
> > > > lots of metadata updates on my limited knowledge of xfs reflink
> > > > implementation, but perhaps it is not the case for other filesystems?
> >
> > Comparatively speaking: compared to copying a large file, reflink is
> > cheap on any filesystem that implements it. Sure, reflinking on XFS
> > is CPU limited, IIRC, to ~10-20,000 extents per second per reflink
> > op per AG, but it's still faster than copying 10-20,000 extents
> > per second per copy op on all but the very fastest, unloaded nvme
> > SSDs...
> >
>
> I think the concern is the added metadata load on the rest of the
> users. Backup app doesn't care about the time it consumes to clone
> before backup. But this concern is not based on actual numbers.

So what is it based on?

> > Really, though, for this use case it'd make more sense to have "per
> > file freeze" semantics. i.e. if you want a consistent backup image
> > on snapshot capable storage, the process is usually "freeze
> > filesystem, snapshot fs, unfreeze fs, do backup from snapshot,
> > remove snapshot". We can already transparently block incoming
> > writes/modifications on files via the freeze mechanism, so why not
> > just extend that to per-file granularity so writes to the "very
> > large read-mostly file" block while it's being backed up....
> >
> > Indeed, this would probably only require a simple extension to
> > FIFREEZE/FITHAW - the parameter is currently ignored, but as defined
> > by XFS it was a "freeze level". Set this to 0xffffffff and then it
> > freezes just the fd passed in, not the whole filesystem.
> > Alternatively, FI_FREEZE_FILE/FI_THAW_FILE is simple to define...
> >
>
> I think it's a good idea to add file freeze semantics to the toolbox
> of useful things that could be accomplished with reflink.

reflink is already atomic w.r.t. other writes - in what way does a
"file freeze" have any impact on a reflink operation? That is, apart
from preventing it from being done, because reflink can modify the
source inode on XFS, too....

> Especially with your plans for subvolumes as files
> How is that coming along by the way?

If I didn't have to spend so much time fire-fighting broken stuff,
I might make more progress.

> Anyway, freeze semantics alone won't work for our backup application
> that needs to be non-intrusive. Even if writes to large file are few,
> backup may take time, so blocking those few writes for that long is
> not acceptable.

So, reflink is too expensive because there are only occasional
writes, but blocking that occasional write is too expensive, too,
even though it is rare?

> Blocking the writes for the setup time of a reflink
> is exactly what I was proposing and in your analogy,

No, I proposed a way to provide a -point in time snapshot- of a file
that doesn't require reflink or any other special filesystem
support.

> the block
> device is frozen only for a short period of time for setting up the
> snapshot and not for the duration of the backup.

Right, it's frozen for as long as it takes to set up a -point in
time snapshot- that the backup can be taken from. You don't need
that to reflink a file. You need it if you want to do something
other than a reflink....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
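
For reference, the "freeze filesystem, snapshot fs, unfreeze fs, do
backup from snapshot, remove snapshot" sequence discussed above can be
sketched roughly as below. This is only an illustration of the existing
whole-filesystem FIFREEZE/FITHAW ioctls, not of the per-file freeze
proposed in this thread, which does not exist. The helper names
(frozen, consistent_backup) and the take_snapshot/copy_from/drop_snapshot
callables are hypothetical placeholders for whatever the snapshot-capable
storage provides. Issuing the real ioctls requires CAP_SYS_ADMIN.

```python
import contextlib
import fcntl
import os

# Linux ioctl number encoding, asm-generic layout (x86, arm64 and most
# others; a few architectures such as powerpc and mips encode it differently).
_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, type_char, nr, size):
    """Build an ioctl request number from its four fields."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


FIFREEZE = _ioc(_IOC_READ | _IOC_WRITE, "X", 119, 4)  # _IOWR('X', 119, int)
FITHAW = _ioc(_IOC_READ | _IOC_WRITE, "X", 120, 4)    # _IOWR('X', 120, int)


@contextlib.contextmanager
def frozen(mountpoint, ioctl=fcntl.ioctl):
    """Freeze the whole filesystem at `mountpoint`; always thaw on exit.

    `ioctl` is injectable (an assumption of this sketch) so the ordering
    can be exercised without root privileges.
    """
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        ioctl(fd, FIFREEZE, 0)  # the argument is ignored today
        try:
            yield fd
        finally:
            ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)


def consistent_backup(mountpoint, take_snapshot, copy_from, drop_snapshot,
                      ioctl=fcntl.ioctl):
    """Freeze only while the snapshot is set up, then back up from it."""
    with frozen(mountpoint, ioctl):
        snap = take_snapshot()      # hypothetical: e.g. a block-level snapshot
    try:
        copy_from(snap)             # the long-running part runs unfrozen
    finally:
        drop_snapshot(snap)
```

The point of the structure is that writers are blocked only for the
duration of take_snapshot(), not for the duration of the copy.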