Re: Reflink (cow) copy of busy files

Hi Amir,

On 26-02-2018 08:58, Amir Goldstein wrote:

> Gionatan,
>
> First of all, the answer to your question is "just" faster copy.
> Reflinking a file is much faster than copying it, but it is not O(1).
> I believe cp --reflink can result in cloning only part of the file if the
> system crashes mid-operation, so in any case, the operation is not *atomic*
> in that sense.
>
> But your question about quiescing the filesystem and your question
> about the *atomic* nature of the clone operation are two very different
> questions.

Can this result in out-of-order writes from the cloned file's point of view? I mean:
- take a 10-extent file;
- a vm/db/whatever is writing to the file;
- a cp --reflink is executed;
- extents are cloned one by one, with extents 1-4 already cloned and 5 in progress;
- the vm/db writes to extent 1 - this write will *not* be present in the cloned file;
- the application writes to extent 6, which will be cloned shortly;
- the cloned file ends up with the later write to extent 6 but not the earlier one to extent 1;
- bad things happen!

If the above is true, then cp --reflink can't be used even for relaxed-consistency backups/clones.
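
To make the scenario concrete, here is a toy model of it in Python. It is only a sketch of the *hypothetical* behaviour I am asking about (an extent-by-extent copy with writes landing in between); the function name and the list-of-strings "extents" are made up for illustration, and nothing here claims that XFS actually behaves this way.

```python
def simulate_non_atomic_clone(num_extents=10):
    """Toy model: copy 'extents' one at a time while a writer keeps writing."""
    source = [f"old-{n}" for n in range(1, num_extents + 1)]
    clone = []
    for i in range(num_extents):  # i is 0-based, extent number is i + 1
        if i == 4:
            # Extents 1-4 are already cloned, extent 5 is in progress.
            source[0] = "new-1"  # earlier write: extent 1 was already copied
            source[5] = "new-6"  # later write: extent 6 is not copied yet
        clone.append(source[i])
    return source, clone


source, clone = simulate_non_atomic_clone()
print(clone[0], clone[5])  # prints "old-1 new-6"
```

In this model the clone contains the later write (extent 6) but not the earlier one (extent 1), which is exactly the reordering a crash-consistent copy must never show.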

> What you seem to *think* xfs reflink does, it does not actually do.
> xfs reflink does NOT reflink the file's in-memory data.
> xfs reflink "only" reflinks the file's on-disk data.
> Right now, if you write a large file without fsync and clone it, you
> might well get a clone of an unallocated or partly fallocated file with
> zero or stale data.

Oh, I absolutely do not expect reflink/clone to work on in-memory data. I *surely* expect dirty, uncommitted data to be lost: this is the very reason I wrote about crash-consistent backup.

In short: is cloning/reflinking the same as "pulling the plug" for the cloned file? I mean:
- a successful clone (so, a non-interrupted/non-crashed one) is akin to an atomic process for the cloned file;
- async writes/dirty data are lost;
- fsynced writes are preserved;
- writes are not reordered/committed out of order.

Maybe the entire discussion is skewed by the fact that, in some cases, I am willing to relax my consistency model to include a crash-consistent backup option. Fact is, in the virtualization world there are many backup utilities/applications which *use* this model, and I wondered if a cp --reflink would give similar results without the hassle.

Maybe the entire crash-vs-application consistency discussion is out of place on a filesystem mailing list, where you (rightfully!!!) strive for perfect/maximum data consistency (and I *really* appreciate that). However, given the recent reflink work on XFS, I wonder if I can put this to "good use" once it is considered stable.

> Going forward, I think there is an intention to "clone" the file's in-memory
> data as well by sharing the READONLY cache pages between cloned files,
> but I don't think dirty pages are going to be shared between clones anyway,
> so you are back to square one - you need to get the data on disk before
> cloning the file.

Great - I think this would do wonders for cache efficiency...
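
For reference, the "get the data on disk, then clone" sequence from the quoted paragraph could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the helper name flush_then_clone is made up, the FICLONE request number is the Linux value (newer Python versions expose it as fcntl.FICLONE), and the fsync only covers data the kernel already has, not anything still sitting in the writing application's own buffers.

```python
import errno
import fcntl
import os

# Linux FICLONE ioctl request (_IOW(0x94, 9, int)); use the fcntl constant
# when this Python version provides it, otherwise fall back to the literal.
FICLONE = getattr(fcntl, "FICLONE", 0x40049409)

# errno values treated here as "reflink not available for this pair of files"
_UNSUPPORTED = (errno.EOPNOTSUPP, errno.ENOTTY, errno.EINVAL,
                errno.EXDEV, errno.ENOSYS)


def flush_then_clone(src_path, dst_path):
    """fsync the source file, then ask the filesystem to reflink it.

    Returns True if the clone succeeded and False if reflink is not
    supported here (dst_path is then left as an empty file).
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        os.fsync(src.fileno())  # push the kernel's dirty pages to disk first
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError as exc:
            if exc.errno in _UNSUPPORTED:
                return False
            raise
    return True
```

Something like flush_then_clone("vm.img", "vm.img.clone") (file names are placeholders) would then either return True on a reflink-capable filesystem or False elsewhere. It says nothing about the ordering question above; it only shows the flush-before-clone step.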


> Cheers,
> Amir.

Thanks.

PS: sorry if I rephrase the question in different terms. English is not my primary language, please bear with me :p

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8