[PATCH v3 3/9] finalize_object_file(): implement collision check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We've had "FIXME!!! Collision check here ?" in finalize_object_file()
since aac1794132 (Improve sha1 object file writing., 2005-05-03). That
is, when we try to write a file with the same name, we assume the
on-disk contents are the same and blindly throw away the new copy.

One of the reasons we never implemented this is because the files it
moves are all named after the cryptographic hash of their contents
(either loose objects, or packs which have their hash in the name these
days). So we are unlikely to see such a collision by accident. And even
though there are weaknesses in sha1, we assume they are mitigated by our
use of sha1dc.

So while it's a theoretical concern now, it hasn't been a priority.
However, if we start using weaker hashes for pack checksums and names,
this will become a practical concern. So in preparation, let's actually
implement a byte-for-byte collision check.

The new check will cause the write of new differing content to be a
failure, rather than a silent noop, and we'll retain the temporary file
on disk.

Note that this may cause some extra computation when the files are in
fact identical, but this should happen rarely. For example, when writing
a loose object, we compute the object id first, then check manually for
an existing object (so that we can freshen its timestamp) before
actually trying to write and link the new data.

Co-authored-by: Jeff King <peff@xxxxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
---
 object-file.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/object-file.c b/object-file.c
index 54a82a5f7a0..85f91516429 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1899,6 +1899,57 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo
 	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
 }
 
+static int check_collision(const char *filename_a, const char *filename_b)
+{
+	char buf_a[4096], buf_b[4096];
+	int fd_a = -1, fd_b = -1;
+	int ret = 0;
+
+	fd_a = open(filename_a, O_RDONLY);
+	if (fd_a < 0) {
+		ret = error_errno(_("unable to open %s"), filename_a);
+		goto out;
+	}
+
+	fd_b = open(filename_b, O_RDONLY);
+	if (fd_b < 0) {
+		ret = error_errno(_("unable to open %s"), filename_b);
+		goto out;
+	}
+
+	while (1) {
+		ssize_t sz_a, sz_b;
+
+		sz_a = read_in_full(fd_a, buf_a, sizeof(buf_a));
+		if (sz_a < 0) {
+			ret = error_errno(_("unable to read %s"), filename_a);
+			goto out;
+		}
+
+		sz_b = read_in_full(fd_b, buf_b, sizeof(buf_b));
+		if (sz_b < 0) {
+			ret = error_errno(_("unable to read %s"), filename_b);
+			goto out;
+		}
+
+		if (sz_a != sz_b || memcmp(buf_a, buf_b, sz_a)) {
+			ret = error(_("files '%s' and '%s' differ in contents"),
+				    filename_a, filename_b);
+			goto out;
+		}
+
+		if (sz_a < sizeof(buf_a))
+			break;
+	}
+
+out:
+	if (fd_a > -1)
+		close(fd_a);
+	if (fd_b > -1)
+		close(fd_b);
+	return ret;
+}
+
 /*
  * Move the just written object into its final resting place.
  */
@@ -1941,8 +1992,8 @@ int finalize_object_file(const char *tmpfile, const char *filename)
 			errno = saved_errno;
 			return error_errno(_("unable to write file %s"), filename);
 		}
-		/* FIXME!!! Collision check here ? */
-		unlink_or_warn(tmpfile);
+		if (check_collision(tmpfile, filename))
+			return -1;
 	}
 
 out:
-- 
2.46.0.430.gca674632b70





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux