Re: git behaviour question regarding SHA-1 and commits

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:

> This is not something you have to worry about, just get on with using
> Git and stop worrying about phenomenally unlikely edge cases that are
> never going to happen.

To everyone repeating answers along these lines: you can stop. The message
has been heard, but it does not answer the original question.

When we create a new object (i.e. "git add" to register a new blob
contents, "git commit" that internally generates new tree objects to
record updated "whole contents" and then records the commit object), we
first compute what the object name of the new object would be, and then
check if we already have an object with the same object name in the object
store. If we do, we do not write the new copy of the object out (see the
function write_sha1_file() in sha1_file.c and the call to has_sha1_file()
that bypasses write_loose_object()).
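
As a rough illustration of the check described above, here is a minimal
sketch in Python rather than Git's actual C code. The dict-based `store`
and the helper names `object_name()` and `write_object()` are hypothetical
stand-ins for the object store and for write_sha1_file(). The only part
taken from Git's real behaviour is the naming rule: the SHA-1 is computed
over a "<type> <size>" header, a NUL byte, and then the contents.

```python
import hashlib


def object_name(obj_type: str, data: bytes) -> str:
    """Return the SHA-1 object name for `data` stored as `obj_type`."""
    # Git hashes a header "<type> <size>\0" followed by the raw contents.
    header = f"{obj_type} {len(data)}".encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def write_object(store: dict, obj_type: str, data: bytes) -> str:
    """Store the object unless its name already exists; return the name."""
    name = object_name(obj_type, data)
    if name in store:
        # Plays the role of the has_sha1_file() check: the existing copy
        # is kept and nothing is written. The contents are NOT compared.
        return name
    # Plays the role of write_loose_object().
    store[name] = (obj_type, data)
    return name


store = {}
first = write_object(store, "blob", b"hello\n")
second = write_object(store, "blob", b"hello\n")  # no second copy is written
print(first == second, len(store))
```

Writing the same contents twice yields the same name and leaves a single
entry in the store, which is the normal, harmless case of this shortcut.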

So the old contents will be kept without getting overwritten.

That sounds nice, but it has interesting consequences: when we find that
the object we tried to write already exists in the object store, we do not
bother running a byte-for-byte comparison that would let us error out,
because the chance of hitting a SHA-1 collision is minuscule.
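
To make the consequence concrete, the following toy model in Python forces
a collision by using a deliberately weak name. Everything here is an
assumption made for illustration: `weak_name()` keeps only two hex digits
of the SHA-1 so that a collision can be found by brute force, and the
`verify` flag shows what a byte-for-byte comparison would catch. Git does
not perform such a comparison, as described above.

```python
import hashlib
from itertools import count


def weak_name(data: bytes) -> str:
    """Hypothetical weak object name: only the first 2 hex digits of SHA-1."""
    return hashlib.sha1(data).hexdigest()[:2]


def write_object(store: dict, data: bytes, verify: bool = False) -> str:
    """Keep the existing object if the name is taken; optionally compare bytes."""
    name = weak_name(data)
    if name in store:
        if verify and store[name] != data:
            # The byte-for-byte check that is skipped in the scheme above.
            raise ValueError(f"collision on object name {name}")
        return name  # the old contents are kept, the new ones are dropped
    store[name] = data
    return name


def find_collision(old: bytes) -> bytes:
    """Brute-force different contents that share the weak name of `old`."""
    target = weak_name(old)
    for i in count():
        candidate = b"new contents %d" % i
        if candidate != old and weak_name(candidate) == target:
            return candidate


old = b"old contents"
new = find_collision(old)

store = {}
name_old = write_object(store, old)
name_new = write_object(store, new)  # silently a no-op: the name exists

print(name_old == name_new)     # both writes report the same name
print(store[name_new] == old)   # but the store still holds the old bytes
```

In this model the second write reports success and returns a valid name,
yet reading that name back gives the old contents. Calling
`write_object(store, new, verify=True)` would raise instead.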

If the collision is between commit objects, for example, we would write
the (old) commit object name to the tip of the current branch. Most
likely, the tree object recorded in the (old) commit would not match the
tree object your "git commit" wanted to record (otherwise you have hit
a SHA-1 collision twice in a row ;-), which would mean "git status" would
show that a whole bunch of paths have changed between the HEAD and the
index. Also "git log" would show the history leading to the (old) commit
that is likely to be very different from what you would expect immediately
after committing the collided commit. Of course, you could recover from it
with "git reset --soft" after finding out what the previous HEAD was from
the reflog, but it won't be a pleasant experience.

There can be other kinds of collisions (e.g. your latest commit might have
collided with an existing blob or tree, in which case it is likely that
almost nothing would work after finding a blob or tree in HEAD).