Re: Any tips for moving to reflink?

On 01.07.2017 18:38, Darrick J. Wong wrote:
> On Sat, Jul 01, 2017 at 01:41:43PM +0200, Marian Beermann wrote:
>> Hi
>>
>>>> I'm planning to use this reflink feature for instant local snapshots
>>>> and then use my backup software of choice, borg, to keep a long time
>>>> history of my work on a remote server. Since borg stores data in a
>>>> dedup fashion I can also backup the reflink snapshots and they won't
>>>> take additional space. The only drawback is that today borg need to
>>>> hash all the files found in a reflink directory in order to find out
>>>> about dedup blocks. I asked a question on the borg mailing list
>>>> https://github.com/borgbackup/borg/issues/2743 and apparently it
>>>> won't be an issue to add a feature to support XFS in order to
>>>> identify the physical extents. Is rmapbt required for that?
>>>
>>> borgbackup will probably need to call the GETFSMAP ioctl, which won't
>>> land until 4.12.  On xfs, rmapbt is needed to supply data block
>>> ownership info, which is what borgbackup (and bees, and...) say they
>>> want to be smarter about dedup.
>>
>> My understanding so far was that FIEMAP would be sufficient to query the
>> extents associated with a file. Shouldn't this be sufficient to know
>> whether two files on the same file system refer to the same data?
> 
> Not necessarily -- FIEMAP provides physical offset into a device but
> does not actually identify which one, which is a problem on multi-device
> filesystems such as btrfs and XFS.  IIRC btrfs creates a virtual
> physical offset space consisting of all the devices one after the other,
> but then you have to know /that/ mapping too.  GETFSMAP by contrast
> tells you which device and where on that device.
> 

I see. If FIEMAP can report identical physical offsets for extents that
actually hold different data, doesn't that break one of its main uses
(detecting identical data)?

To clarify the intended use:

Borg would essentially hash the output of FIEMAP/GETFSMAP for a given
file and compare this hash with a previous hash.

If the two hashes don't match,
then Borg would re-process the entire file.
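
A minimal sketch of that check, in Python since that is what Borg is
written in. This is not Borg code: read_extents, extent_digest and
needs_reprocessing are hypothetical names, and the struct layouts and
ioctl number are my reading of <linux/fiemap.h> and <linux/fs.h>. It
hashes the raw FIEMAP tuples and so inherits the caveat above: the
physical offset does not say which device it refers to.

```python
import fcntl
import hashlib
import struct

# Values as I read them from <linux/fs.h> and <linux/fiemap.h>.
FS_IOC_FIEMAP = 0xC020660B      # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1          # flush the file before mapping
FIEMAP_EXTENT_LAST = 0x1        # set on the final extent of the file

# struct fiemap header: start, length, flags, mapped, count, reserved
HEADER = struct.Struct("=QQLLLL")
# struct fiemap_extent: logical, physical, length, 2x reserved64,
# flags, 3x reserved32
EXTENT = struct.Struct("=QQQQQLLLL")


def read_extents(fd, batch=128):
    """Return [(logical, physical, length, flags), ...] for an open fd."""
    extents = []
    start = 0
    while True:
        buf = bytearray(HEADER.size + batch * EXTENT.size)
        HEADER.pack_into(buf, 0, start, 0xFFFFFFFFFFFFFFFF,
                         FIEMAP_FLAG_SYNC, 0, batch, 0)
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)   # kernel fills buf in place
        mapped = HEADER.unpack_from(buf, 0)[3]
        if mapped == 0:
            return extents
        for i in range(mapped):
            fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
            logical, physical, length, flags = (
                fields[0], fields[1], fields[2], fields[5])
            extents.append((logical, physical, length, flags))
            if flags & FIEMAP_EXTENT_LAST:
                return extents
        # Buffer was full; continue after the last extent seen.
        start = logical + length


def extent_digest(extents):
    """Hash the extent list; order matters, so the list is hashed as-is."""
    h = hashlib.sha256()
    for logical, physical, length, flags in extents:
        h.update(struct.pack("=QQQL", logical, physical, length, flags))
    return h.hexdigest()


def needs_reprocessing(extents, previous_digest):
    """True if the file must be read and chunked again."""
    return extent_digest(extents) != previous_digest
```

Usage would be roughly needs_reprocessing(read_extents(f.fileno()),
stored_digest), with stored_digest kept next to the file's entry from
the previous backup. Whether the extent flags belong in the hash is an
open choice; they are included here so that a change in extent state
also triggers a re-read.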

It'd be possible to make this more granular, on a per-extent basis.
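
One way the per-extent variant might look, again only as a sketch:
changed_ranges is a hypothetical helper, and extents are assumed to be
(logical, physical, length) tuples as reported by FIEMAP or GETFSMAP.
Only the logical ranges whose mapping was not seen last time would be
read and hashed again.

```python
def changed_ranges(previous, current):
    """Return [(logical, length), ...] for extents that are new or moved.

    previous and current are lists of (logical, physical, length)
    tuples from two successive mappings of the same file.
    """
    known = set(previous)
    return [(logical, length)
            for (logical, physical, length) in current
            if (logical, physical, length) not in known]


# Example: the second extent was rewritten and now lives elsewhere.
old = [(0, 4096, 8192), (8192, 65536, 4096)]
new = [(0, 4096, 8192), (8192, 131072, 4096)]
ranges = changed_ranges(old, new)   # [(8192, 4096)]
```

This is conservative: if the filesystem splits or merges extents
without changing the data, the affected range is simply processed
again.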

Cheers, Marian