Re: XFS reflink copy to different filesystem performance question

On 2022-03-16 1:33, Dave Chinner wrote:

Yeah, Veeam appears to use the shared data extent functionality in
XFS for deduplication and cloning. reflink is the user-facing name
for space-efficient file cloning (via cp --reflink).

I read bits and pieces about cp --reflink. I guess using that would be
a more "standard" *nix way of using dedupe? For example, cp --reflink, then
use rsync to do a delta sync against the new copy (to get the updates)?
Not that I have a need to do this; I'm just curious about the workflow.
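
To make the cloning step concrete, here is a minimal sketch of what a reflink copy amounts to at the system-call level, assuming Linux and the FICLONE ioctl from <linux/fs.h>. The helper name clone_or_copy and the fallback behaviour are illustrative choices, not anything cp or Veeam is known to do in exactly this form. After such a clone, an in-place delta update (for example rsync with --inplace and --no-whole-file) should only unshare the blocks it actually rewrites, though I have not measured that here.

```python
import fcntl
import os
import shutil
import tempfile

# FICLONE ioctl request number from <linux/fs.h>: _IOW(0x94, 9, int).
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Try to reflink src to dst; fall back to a plain byte copy.

    Returns "reflink" if the clone ioctl succeeded (blocks are shared),
    or "copy" if the filesystem refused and the data was copied instead.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share all of src's extents.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # No reflink support, or src and dst are on different filesystems.
            shutil.copyfileobj(fsrc, fdst)
            return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src.bin")
        dst = os.path.join(d, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(1 << 20))
        # Prints "reflink" on a reflink-capable filesystem, "copy" otherwise.
        print(clone_or_copy(src, dst))
```

The fallback matters for the case in this thread: a clone can only share blocks within one filesystem, so a copy to a different filesystem always has to read and write the full data.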

I'm guessing that you're trying to copy a deduplicated file,
resulting in the same physical blocks being read over and over again
at different file offsets and causing the disks to seek because it's
not physically sequential data.

Thanks for confirming that, it's what I suspected.
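
A toy model of that effect, as a sketch only: the extent tuples below are made-up layouts, not data from this system, and the model ignores the page cache and any readahead. It counts how often a reader walking the file in logical order has to jump to a non-adjacent physical location.

```python
def count_seeks(extents):
    """Count discontiguous jumps when reading extents in logical order.

    extents: list of (logical_offset, physical_offset, length) in bytes.
    A seek is counted whenever the next extent does not start where the
    previous one ended on disk.
    """
    seeks = 0
    expected = None
    for _logical, physical, length in sorted(extents):
        if expected is not None and physical != expected:
            seeks += 1
        expected = physical + length
    return seeks


MiB = 1 << 20

# Hypothetical non-shared file: 8 extents laid out back to back on disk.
plain = [(i * MiB, 100 * MiB + i * MiB, MiB) for i in range(8)]

# Hypothetical deduplicated file: 8 logical extents that alternate
# between the same two physical extents.
deduped = [(i * MiB, (100 if i % 2 == 0 else 500) * MiB, MiB)
           for i in range(8)]

print(count_seeks(plain))    # 0
print(count_seeks(deduped))  # 7
```

Real extent maps can be inspected with filefrag -v; the point of the sketch is only that the same logical read pattern turns into very different physical access patterns.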

[..]

Maybe they are doing that with FIEMAP to resolve deduplicated
regions and caching them, or they have some other information in
their backup/deduplication data store that allows them to optimise
the IO. You'll need to actually run things like strace on the copies
to find out exactly what it is doing....

OK, thanks for the info. I do see periods with lots of disk reads on the
source and no writes happening on the destination. I guess it is sorting
through what it needs to get. One of those periods lasted about 20 minutes.
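
One way the "resolve deduplicated regions and cache them" idea could look, purely as a sketch: this is not Veeam's known behaviour, the function name plan_physical_reads is made up, and it assumes shared regions line up as identical (physical offset, length) pairs, which real extent maps need not do. Given an extent map such as FIEMAP would return, it groups logical offsets by the physical extent behind them so each physical extent is read once, in disk order.

```python
from collections import defaultdict


def plan_physical_reads(extents):
    """Group logical extents by the physical extent that backs them.

    extents: list of (logical_offset, physical_offset, length) in bytes.
    Returns a list of (physical_offset, length, [logical_offsets]) sorted
    by physical offset, so that each physical extent is read once and then
    written out to every logical offset that references it.
    """
    by_physical = defaultdict(list)
    for logical, physical, length in extents:
        by_physical[(physical, length)].append(logical)
    return [(physical, length, sorted(logicals))
            for (physical, length), logicals in sorted(by_physical.items())]


# Hypothetical extent map: four logical extents backed by two physical ones.
example = [(0, 500, 10), (10, 100, 10), (20, 500, 10), (30, 100, 10)]

for physical, length, logicals in plan_physical_reads(example):
    print(physical, length, logicals)
```

A planning pass like this has to walk the whole extent map before any data is written, which would be consistent with a long read-only phase, but only strace on the actual copy would show what is really happening.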

No, they don't exist because largely reading a reflinked file
performs no differently to reading a non-shared file.

Good to know, certainly would be nice if there was at least a way to
identify a file as having X number of links.

To do that efficiently (i.e. without a full filesystem scan) you
need to look up the filesystem reverse mapping table to find all the
owners of pointers to a given block.  I bet you didn't make the
filesystem with "-m rmapbt=1" to enable that functionality - nobody
does that unless they have a reason to because it's not enabled by
default (yet).

I'm sure I did not do that either, but I can do that if you think it
would be advantageous. I do plan to ship this DL380Gen10 XFS system to
another location and am happy to reformat the XFS volume with that extra
option if it would be useful.

I don't anticipate needing to deal directly with this reflinked data,
just let Veeam do its thing. Thanks for clearing things up for
me so quickly!

nate



