[PATCH 7/8] core.fsync: new option to harden loose references

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



When writing loose references to disk we first create a lockfile, write
the updated value of the reference into that lockfile, and on commit we
rename the file into place. According to filesystem developers, this
behaviour is broken because applications should always sync data to disk
before doing the final rename to ensure data consistency [1][2][3]. If
applications fail to do this correctly, a hard crash of the machine can
easily result in corrupted on-disk data.

This kind of corruption can in fact be easily observed with Git when the
machine hard-crashes shortly after writing loose references to disk. On
machines with ext4, this will likely lead to the "empty files" problem:
the file has been renamed, but its data has not been synced to disk. The
result is that the references is corrupt, and in the worst case this can
lead to data loss.

Implement a new option to harden loose references so that users and
admins can avoid this scenario by syncing locked loose references to
disk before we rename them into place.

[1]: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
[2]: https://btrfs.wiki.kernel.org/index.php/FAQ (What are the crash guarantees of overwrite-by-rename)
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/ext4.rst (see auto_da_alloc)

Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
---
 Documentation/config/core.txt |  2 ++
 cache.h                       |  6 +++++-
 config.c                      |  2 ++
 refs/files-backend.c          | 29 +++++++++++++++++++++++++++++
 4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 973805e8a9..b67d3c340e 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -564,8 +564,10 @@ core.fsync::
 * `pack-metadata` hardens packfile bitmaps and indexes.
 * `commit-graph` hardens the commit graph file.
 * `index` hardens the index when it is modified.
+* `loose-ref` hardens references modified in the repo in loose-ref form.
 * `objects` is an aggregate option that is equivalent to
   `loose-object,pack`.
+* `refs` is an aggregate option that is equivalent to `loose-ref`.
 * `derived-metadata` is an aggregate option that is equivalent to
   `pack-metadata,commit-graph`.
 * `default` is an aggregate option that is equivalent to
diff --git a/cache.h b/cache.h
index 63a95d1977..b56a56f539 100644
--- a/cache.h
+++ b/cache.h
@@ -1005,11 +1005,14 @@ enum fsync_component {
 	FSYNC_COMPONENT_PACK_METADATA		= 1 << 2,
 	FSYNC_COMPONENT_COMMIT_GRAPH		= 1 << 3,
 	FSYNC_COMPONENT_INDEX			= 1 << 4,
+	FSYNC_COMPONENT_LOOSE_REF		= 1 << 5,
 };
 
 #define FSYNC_COMPONENTS_OBJECTS (FSYNC_COMPONENT_LOOSE_OBJECT | \
 				  FSYNC_COMPONENT_PACK)
 
+#define FSYNC_COMPONENTS_REFS (FSYNC_COMPONENT_LOOSE_REF)
+
 #define FSYNC_COMPONENTS_DERIVED_METADATA (FSYNC_COMPONENT_PACK_METADATA | \
 					   FSYNC_COMPONENT_COMMIT_GRAPH)
 
@@ -1026,7 +1029,8 @@ enum fsync_component {
 			      FSYNC_COMPONENT_PACK | \
 			      FSYNC_COMPONENT_PACK_METADATA | \
 			      FSYNC_COMPONENT_COMMIT_GRAPH | \
-			      FSYNC_COMPONENT_INDEX)
+			      FSYNC_COMPONENT_INDEX | \
+			      FSYNC_COMPONENT_LOOSE_REF)
 
 /*
  * A bitmask indicating which components of the repo should be fsynced.
diff --git a/config.c b/config.c
index f03f27c3de..b5d3e6e404 100644
--- a/config.c
+++ b/config.c
@@ -1332,7 +1332,9 @@ static const struct fsync_component_entry {
 	{ "pack-metadata", FSYNC_COMPONENT_PACK_METADATA },
 	{ "commit-graph", FSYNC_COMPONENT_COMMIT_GRAPH },
 	{ "index", FSYNC_COMPONENT_INDEX },
+	{ "loose-ref", FSYNC_COMPONENT_LOOSE_REF },
 	{ "objects", FSYNC_COMPONENTS_OBJECTS },
+	{ "refs", FSYNC_COMPONENTS_REFS },
 	{ "derived-metadata", FSYNC_COMPONENTS_DERIVED_METADATA },
 	{ "default", FSYNC_COMPONENTS_DEFAULT },
 	{ "committed", FSYNC_COMPONENTS_COMMITTED },
diff --git a/refs/files-backend.c b/refs/files-backend.c
index f59589d6cc..279316de45 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1392,6 +1392,15 @@ static int refs_rename_ref_available(struct ref_store *refs,
 	return ok;
 }
 
+static int files_sync_loose_ref(struct ref_lock *lock, struct strbuf *err)
+{
+	int ret = fsync_component(FSYNC_COMPONENT_LOOSE_REF, get_lock_file_fd(&lock->lk));
+	if (ret)
+		strbuf_addf(err, "could not sync loose ref '%s': %s", lock->ref_name,
+			    strerror(errno));
+	return ret;
+}
+
 static int files_copy_or_rename_ref(struct ref_store *ref_store,
 			    const char *oldrefname, const char *newrefname,
 			    const char *logmsg, int copy)
@@ -1504,6 +1513,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store,
 	oidcpy(&lock->old_oid, &orig_oid);
 
 	if (write_ref_to_lockfile(lock, &orig_oid, 0, &err) ||
+	    files_sync_loose_ref(lock, &err) ||
 	    commit_ref_update(refs, lock, &orig_oid, logmsg, &err)) {
 		error("unable to write current sha1 into %s: %s", newrefname, err.buf);
 		strbuf_release(&err);
@@ -1524,6 +1534,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store,
 	flag = log_all_ref_updates;
 	log_all_ref_updates = LOG_REFS_NONE;
 	if (write_ref_to_lockfile(lock, &orig_oid, 0, &err) ||
+	    files_sync_loose_ref(lock, &err) ||
 	    commit_ref_update(refs, lock, &orig_oid, NULL, &err)) {
 		error("unable to write current sha1 into %s: %s", oldrefname, err.buf);
 		strbuf_release(&err);
@@ -2819,6 +2830,24 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 		}
 	}
 
+	/*
+	 * Sync all lockfiles to disk to ensure data consistency. We do this in
+	 * a separate step such that we can sync all modified refs in a single
+	 * step, which may be more efficient on some filesystems.
+	 */
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+		struct ref_lock *lock = update->backend_data;
+
+		if (!(update->flags & REF_NEEDS_COMMIT))
+			continue;
+
+		if (files_sync_loose_ref(lock, err)) {
+			ret  = TRANSACTION_GENERIC_ERROR;
+			goto cleanup;
+		}
+	}
+
 cleanup:
 	free(head_ref);
 	string_list_clear(&affected_refnames, 0);
-- 
2.35.1

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux