Hi Amir,
On 26-02-2018 08:58, Amir Goldstein wrote:
Gionatan,
First of all, the answer to your question is "just" a faster copy.
Reflinking a file is much faster than copying it, but it is not O(1).
I believe cp --reflink can result in cloning part of the file if the
system crashes mid-operation, so in any case, the operation is not
*atomic* in that sense.
But your question about quiescing the filesystem and your question
about the *atomic* nature of the clone operation are two very different
questions.
Can this result in out-of-order writes from the cloned file's point of
view? I mean:
- take a 10-extents file;
- a vm/db/whatever is writing to the file;
- a cp --reflink is executed;
- extents are cloned one-by-one, with extents 1-4 already cloned, 5 is in
progress;
- the vm/db writes to extent no. 1 - this write will *not* be present in
the cloned file;
- the application then writes to extent no. 6, which will be cloned shortly;
- the cloned file ends up with the later write to extent no. 6 but not the
earlier one to extent no. 1;
- bad things happen!
If the above is true, then cp --reflink can't be used even for
relaxed-consistency backup/clones.
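
To make the scenario above concrete, here is a toy model of it. It is only
a sketch of the hypothetical extent-by-extent race described in the list,
not of how XFS actually performs a clone; the function name and the
"write-A"/"write-B" labels are made up for illustration.

```python
# Toy model of the *hypothetical* race described above.
# It does not reflect how XFS really clones a file.
def clone_with_racing_writes(num_extents=10):
    source = ["old"] * num_extents   # extent contents before any new write
    clone = [None] * num_extents

    for i in range(num_extents):
        if i == 4:
            # Extents 1-4 are already cloned, extent 5 is in progress.
            source[0] = "write-A"    # earlier write, to already-cloned extent 1
            source[5] = "write-B"    # later write, to not-yet-cloned extent 6
        clone[i] = source[i]         # "clone" one extent at a time

    return source, clone


source, clone = clone_with_racing_writes()
print(clone[0], clone[5])  # prints "old write-B"
```

In this model the clone holds the later write but not the earlier one,
which is exactly the reordering I am worried about.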
What you seem to *think* xfs reflink does, it does not actually do.
xfs reflink does NOT reflink the file in-memory data.
xfs reflink "only" reflinks the file on-disk data.
Right now, if you write a large file without fsync and clone it, you
might as well get a clone of an unallocated or partly fallocated file
with zero or stale data.
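
The point about getting data on disk first can be sketched as follows: a
minimal example, assuming Linux, that fsyncs the source and then asks for a
clone through the FICLONE ioctl. The helper name flush_then_clone and the
fallback to a plain copy are assumptions made for illustration; note that
fsync only flushes data the kernel already holds, not data still buffered
inside the writing application.

```python
import fcntl
import os
import shutil

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int)
FICLONE = 0x40049409


def flush_then_clone(src_path, dst_path):
    """Flush src to disk, then reflink it to dst.

    Returns True if the file was reflinked, False if we fell back to a
    plain copy (hypothetical fallback for filesystems without reflink).
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Push dirty page-cache data for the source to stable storage,
        # so the clone sees allocated, written extents.
        os.fsync(src.fileno())
        try:
            # Ask the filesystem to share the source's extents with dst.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Reflink not supported here: copy the bytes instead.
            src.seek(0)
            shutil.copyfileobj(src, dst)
            return False
```

A caller would use it as flush_then_clone("disk.img", "disk.img.bak"),
where both paths are placeholders and must live on the same filesystem for
the reflink to succeed.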
Oh, I absolutely do not expect reflink/clone to work on in-memory
data. I *surely* expect dirty, uncommitted data to be lost: this is
the very reason I wrote about crash-consistent backup.
In short: is cloning/reflink the same as "pulling the plug" for the
cloned file? I mean:
- a successful clone (so, a non-interrupted/non-crashed one) is akin to an
atomic process for the cloned file;
- async writes/dirty data are lost;
- fsynced writes are preserved;
- writes are not reordered/committed out of order.
Maybe the entire discussion is skewed by the fact that, in some cases, I
am willing to relax my consistency model to include a crash-consistent
backup option. Fact is, in the virtualization world there are many
backup utilities/applications which *use* this model, and I wondered if
a cp --reflink would give similar results without the hassle.
Maybe the entire crash-vs-application consistency is out of place in a
filesystem mailing list, where you (rightfully!!!) strive for
perfect/maximum data consistency (and I *really* appreciate that).
However, given the recent reflink work on XFS, I wonder if I can
put this to "good use" when it is considered stable.
Going forward, I think there is an intention to "clone" the file
in-memory data as well by sharing the READONLY cache pages between
cloned files, but I don't think dirty pages are going to be shared
between clones anyway, so you are back to square one - need to get the
data on-disk before cloning the file.
Great - I think this would do wonders for cache efficiency...
Cheers,
Amir.
Thanks.
PS: sorry if I rephrase the question in different terms. English is not
my primary language, please bear with me :p
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8