On Thu, Nov 02, 2023 at 09:39:53AM -0700, Darrick J. Wong wrote:
> On Thu, Nov 02, 2023 at 01:42:54PM +0100, Alexander Puchmayr wrote:
> > Hi there,
> >
> > I just encountered a problem when trying to use xfsdump on a filesystem with
> > lots of reflink copied vm disk images, yielding a dump file much larger than
> > expected and which I also was unable to restore from (target disk full). I
> > created a gentoo bug item under https://bugs.gentoo.org/916704 and I got
> > advised to report it here as well.
> >
> > Copy from the bug report:
> >
> > sys-fs/xfsdump-3.1.12 seems to copy reflink copied files as ordinary files,
> > resulting in a way too big dump file. Restoring from such a dump yields likely
> > a out-of-diskspace condition.
>
> Correct, xfsdump (and tar, and rsync...) does not know how to preserve
> the sharing factor of a particular space extent.  All of those tools
> walk the inodes on a filesystem, open them, and read() out the data.
>
> Although there are ways to find out which file(s) own a piece of disk
> space, each of those tools would most likely require a thorough redesign
> to the dump file format to allow pointing to shared blocks elsewhere in
> the dump file.

I don't think that is the case. Like XFS, xfsdump encodes the user
data it backs up in extent records, and it has different types of
extents. It currently understands "data" and "hole" extents as
returned by XFS_IOC_GETBMAPX, so we could extend that to encode
"shared" extents that point to an offset and length in a different
inode.

Yes, this means during the scan we have to record all shared extents
with their underlying block number, then after the scan we need to
resolve each of those to the single copy we are going to keep as a
normal data extent in the dump (i.e. the first to be restored). Then
we convert all the others to the new shared extent type that points
at the {ino, off, len} that contains the actual data in the dump.

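To make the scan-then-resolve step concrete, here is a rough sketch in
Python rather than xfsdump's C. Everything in it is hypothetical: the
ScannedExtent and DumpExtent records, the "shared" kind and the
resolve_shared() helper are made up for illustration and are not
xfsdump's real structures or dump format. It also assumes shared
extents line up exactly on (block, length) and that scan order is
restore order; real code would have to split partially overlapping
extents first.

```python
from collections import namedtuple

# Hypothetical records, NOT xfsdump's on-media format.
# ScannedExtent: what the scan recorded for one extent of one inode.
ScannedExtent = namedtuple("ScannedExtent", "ino off length block shared")
# DumpExtent: what would be written to the dump. kind is "data" or
# "shared"; src_ino/src_off are only set for "shared" extents.
DumpExtent = namedtuple("DumpExtent", "ino off length kind src_ino src_off")


def resolve_shared(extents):
    """Keep the first copy of each shared extent as data and turn every
    later copy into a reference to that first copy.

    Assumes shared extents match exactly on (block, length).
    """
    first_owner = {}  # (block, length) -> (ino, off) of the copy we keep
    out = []
    for e in extents:
        key = (e.block, e.length)
        if e.shared and key in first_owner:
            # Data is already in the dump under another inode: point at it.
            src_ino, src_off = first_owner[key]
            out.append(DumpExtent(e.ino, e.off, e.length,
                                  "shared", src_ino, src_off))
        else:
            # Unshared, or the first time we see this shared block range.
            if e.shared:
                first_owner[key] = (e.ino, e.off)
            out.append(DumpExtent(e.ino, e.off, e.length,
                                  "data", None, None))
    return out


if __name__ == "__main__":
    scanned = [
        ScannedExtent(ino=10, off=0, length=8, block=1000, shared=True),
        ScannedExtent(ino=11, off=0, length=8, block=1000, shared=True),
        ScannedExtent(ino=11, off=8, length=4, block=2000, shared=False),
    ]
    for d in resolve_shared(scanned):
        print(d)
```

With that input, inode 10 keeps the only real copy of block 1000 and
inode 11's first extent becomes a reference to {ino 10, off 0, len 8}.
The dump then carries the shared data once instead of once per file.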
Now all restore needs to do is run FICLONERANGE when it comes across
a shared extent - it's got all the info it needs in the dump to
recreate the shared extent. We can use restore side ordering to
guarantee that the data we need to clone is already on disk (e.g.
delay extent clones until after all the normal data has been
restored) so that all the shared extents we restore end up with the
correct data in them.

Yes, this means we need to bump the dump format version number to
support shared extents, but overall it's not a major revision of the
format or major surgery to the code base. It doesn't require kernel
or even XFS expertise to implement - it's all userspace stuff and
fairly straightforward - it just requires time, resources and
commitment.

> Regardless, nobody's submitted code to do any of those things.  Patches
> welcome.

Yup, that is the biggest issue - there are always more things to do
than we have people to do them.

> > It may be used as a denial-of-service tool which can be used by an ordinary
>
> Please do not file a ^^^^^^^^^^^^^^^^^ CVE for this.

/me sighs

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx