[RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 23 Mar 2022 15:18:31 +0100

Add a new fsyncMethod.batch.quarantine setting which defaults to
"false". Preceding (RFC, and not meant to flip-flop like that
eventually) commits ripped out the "tmp-objdir" part of the
core.fsyncMethod=batch.

This documentation proposes to keep that as the default for the
reasons discussed in it, while allowing users to set
"fsyncMethod.batch.quarantine=true".

Furthermore update the discussion of "core.fsyncObjectFiles" with
information about what it *really* does, why you probably shouldn't
use it, and how to safely emulate most of what it gave users in the
past in terms of performance benefit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
---
 Documentation/config/core.txt | 80 +++++++++++++++++++++++++++++++----
 1 file changed, 72 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index f598925b597..365a12dc7ae 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -607,21 +607,85 @@ stored on NTFS or ReFS filesystems.
 +
 The `batch` is currently only applies to loose-object files and will
 kick in when using the linkgit:git-unpack-objects[1] and
-linkgit:update-index[1] commands. Note that the "last" file to be
+linkgit:git-update-index[1] commands. Note that the "last" file to be
 synced may be the last object, as in the case of
 linkgit:git-unpack-objects[1], or relevant "index" (or in the future,
 "ref") update, as in the case of linkgit:git-update-index[1]. I.e. the
 batch syncing of the loose objects may be deferred until a subsequent
 fsync() to a file that makes them "active".
 
+fsyncMethod.batch.quarantine::
+	A boolean which if set to `true` will cause "batched" writes
+	to objects to be "quarantined" if
+	`core.fsyncMethod=batch`. This is `false` by default.
++
+The primary object of these fsync() settings is to protect against
+repository corruption of things which are reachable, i.e. "reachable",
+via references, the index etc. Not merely objects that were present in
+the object store.
++
+Historically setting `core.fsyncObjectFiles=false` assumed that on a
+filesystem with where an fsync() would flush all preceding outstanding
+I/O that we might end up with a corrupt loose object, but that was OK
+as long as no reference referred to it. We'd eventually the corrupt
+object with linkgit:git-gc[1], and linkgit:git-fsck[1] would only
+report it as a minor annoyance
++
+Setting `fsyncMethod.batch.quarantine=true` takes the view that
+something like a corrupt *unreferenced* loose object in the object
+store is something we'd like to avoid, at the cost of reduced
+performance when using `core.fsyncMethod=batch`.
++
+Currently this uses the same mechanism described in the "QUARANTINE
+ENVIRONMENT" in the linkgit:git-receive-pack[1] documentation, but
+that's subject to change. The performance loss is because we need to
+"stage" the objects in that quarantine environment, fsync() it, and
+once that's done rename() or link() it in-place into the main object
+store, possibly with an fsync() of the index or ref at the end
++
+With `fsyncMethod.batch.quarantine=false` we'll "stage" things in the
+main object store, and then do one fsync() at the very end, either on
+the last object we write, or file (index or ref) that'll make it
+"reachable".
++
+The bad thing about setting this to `true` is lost performance, as
+well as not being able to access the objects as they're written (which
+e.g. consumers of linkgit:git-update-index[1]'s `--verbose` mode might
+want to do).
++
+The good thing is that you should be guaranteed not to get e.g. short
+or otherwise corrupt loose objects if you pull your power cord, in
+practice various git commands deal quite badly with discovering such a
+stray corrupt object (including perhaps assuming it's valid based on
+its existence, or hard dying on an error rather than replacing
+it). Repairing such "unreachable corruption" can require manual
+intervention.
+
 core.fsyncObjectFiles::
-	This boolean will enable 'fsync()' when writing object files.
-	This setting is deprecated. Use core.fsync instead.
-+
-This setting affects data added to the Git repository in loose-object
-form. When set to true, Git will issue an fsync or similar system call
-to flush caches so that loose-objects remain consistent in the face
-of a unclean system shutdown.
+	This boolean will enable 'fsync()' when writing loose object
+	files.
++
+This setting is the historical fsync configuration setting. It's now
+*deprecated*, you should use `core.fsync` instead, perhaps in
+combination with `core.fsyncMethod=batch`.
++
+The `core.fsyncObjectFiles` was initially added based on integrity
+assumptions that early (pre-ext-4) versions of Linux's "ext"
+filesystems provided.
++
+I.e. that a write of file A without an `fsync()` followed by a write
+of file `B` with `fsync()` would implicitly guarantee that `A' would
+be `fsync()`'d by calling `fsync()` on `B`. This asssumption is *not*
+backed up by any standard (e.g. POSIX), but worked in practice on some
+Linux setups.
++
+Nowadays you should almost certainly want to use
+`core.fsync=loose-object` instead in combination with
+`core.fsyncMethod=bulk`, and possibly with
+`fsyncMethod.batch.quarantine=true`, see above. On modern OS's (Linux,
+OSX, Windows) that gives you most of the performance benefit of
+`core.fsyncObjectFiles=false` with all of the safety of the old
+`core.fsyncObjectFiles=true`.
 
 core.preloadIndex::
 	Enable parallel index preload for operations like 'git diff'
-- 
2.35.1.1428.g1c1a0152d61