Re: git-svn and huge data and modifying the git-svn-HEAD branch directly

Linus Torvalds <torvalds@xxxxxxxx> writes:

> But if somebody does the get_sha1() magic, and Junio agrees, then I think 
> it would be a great thing to do.

I am inclined to agree here.

Some caveats upfront, though.

Since I was bitten at least once when I tried to make get_sha1()
deal with ambiguous names (the issue was between heads and tags,
but I think there are similar issues here), I am really reluctant
to have the function look anywhere other than heads/ and
tags/ without an explicit prefix.

Currently get_sha1_basic() says:

	* look in $GIT_DIR with these prefixes in turn and take
          the first match: "", "refs", "refs/tags", "refs/heads".

The extended one _would_ in addition say one of these things:

	* if none of the above prefixes work, try other
          directories under refs/ as prefixes and take the first
          match.

	or

	* if none of the above prefixes work, try other
          directories under refs/ as prefixes and if there is a
          unique match take it.  If there is more than one
          match, do not take any of them.

In the context of get_sha1(), get_sha1_basic() is used like
this:

	* if get_sha1_basic() finds an answer, use it.
          Otherwise see if it is an abbreviated object name.
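
To make the current flow concrete, here is a rough sketch in
Python rather than the actual C code.  It assumes a hypothetical
in-memory model: refs is a dict from path (relative to $GIT_DIR)
to object name, and objects is a list of full hex object names.
The abbreviation handling is simplified and only meant to show
the order of the two steps.

```python
# Hypothetical model: real git reads ref files under $GIT_DIR instead.
PREFIXES = ["", "refs", "refs/tags", "refs/heads"]


def get_sha1_basic(refs, name):
    """Try each prefix in turn and return the first ref that exists."""
    for prefix in PREFIXES:
        path = f"{prefix}/{name}" if prefix else name
        if path in refs:
            return refs[path]
    return None


def get_sha1(refs, objects, name):
    """Use the ref lookup if it answers; otherwise try an abbreviation."""
    sha1 = get_sha1_basic(refs, name)
    if sha1 is not None:
        return sha1
    # Simplified abbreviated-object-name lookup: accept a unique prefix match.
    matches = [obj for obj in objects if obj.startswith(name)]
    return matches[0] if len(matches) == 1 else None
```

With this ordering a tag wins over a head of the same name,
because "refs/tags" is tried before "refs/heads".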

The behaviour of a naive implementation of the former would
depend on readdir() traversal order, which (from the end user's
point of view) makes for hard-to-understand confusion that is
not reproducible.  Another repository cloned from such a
repository could even give you different answers.
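
The order dependency can be illustrated with a small Python
sketch.  The directory names "remotes" and "svn" and the
function first_match() are made up for illustration; the list
passed in stands for whatever order readdir() happens to return.

```python
def first_match(refs, dirs_in_readdir_order, name):
    """Take the first refs/<dir>/<name> that exists, in the order given."""
    for d in dirs_in_readdir_order:
        path = f"refs/{d}/{name}"
        if path in refs:
            return refs[path]
    return None


# Same refs, two possible traversal orders, two different answers.
refs = {"refs/remotes/foo": "1" * 40, "refs/svn/foo": "2" * 40}
answer_a = first_match(refs, ["remotes", "svn"], "foo")
answer_b = first_match(refs, ["svn", "remotes"], "foo")
```

Here answer_a and answer_b differ even though the refs are
identical, which is exactly the kind of thing an end user cannot
be expected to diagnose.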

The latter sounds sane at first, but it has a subtle issue,
which is what bit me previously between heads/ and tags/.
In that broken version, if you have a head called "dead" and a
tag with the same name, neither was taken ("they are not unique,
so do not take either!") and we ended up finding an object whose
SHA1 name began with those two bytes 0xDE 0xAD.  I do not think
this has happened in the field, fortunately, but it would have
been quite hard to diagnose.
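
A sketch of that failure mode in Python, under the assumption
of a simplified in-memory model (refs as a dict, objects as a
list of hex names; the function names are hypothetical and this
is not the reverted patch itself):

```python
def unique_ref(refs, name):
    """Return the ref only if exactly one of heads/ and tags/ has it."""
    candidates = (f"refs/heads/{name}", f"refs/tags/{name}")
    hits = [refs[p] for p in candidates if p in refs]
    # The bug: "ambiguous" and "not found" both come back as None.
    return hits[0] if len(hits) == 1 else None


def broken_get_sha1(refs, objects, name):
    sha1 = unique_ref(refs, name)
    if sha1 is not None:
        return sha1
    # Ambiguity silently falls through to the abbreviated-name lookup.
    matches = [obj for obj in objects if obj.startswith(name)]
    return matches[0] if len(matches) == 1 else None


refs = {"refs/heads/dead": "1" * 40, "refs/tags/dead": "2" * 40}
objects = ["dead" + "0" * 36]
result = broken_get_sha1(refs, objects, "dead")
```

In this model result is the unrelated object whose name starts
with "dead", not the head and not the tag.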

So if we were to do it, I would say do the latter, but be very
careful to make sure you fail the whole get_sha1() when you bail
out of the "try possible prefixes" codepath because of
ambiguity.  There may be other issues involved, but I wouldn't
know -- I reverted the "do not take either if they are
ambiguous between heads/ and tags/" patch primarily for the
reason given in the paragraph above, but I also did not want to
deal with any other potential issues, to keep my sanity ;-).
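
One way to get the "fail the whole thing" behaviour, again as a
Python sketch over a hypothetical in-memory model and not a
proposed patch: make ambiguity a distinct outcome from "not
found", so the caller cannot fall through by accident.  The
AmbiguousRef exception and the default prefix list are
assumptions made for the example.

```python
class AmbiguousRef(Exception):
    """Raised when a name matches under more than one prefix."""


def resolve_ref(refs, name, prefixes):
    """Return the unique match, None if no match, raise if ambiguous."""
    hits = [p for p in (f"{pre}/{name}" for pre in prefixes) if p in refs]
    if len(hits) > 1:
        raise AmbiguousRef(f"{name}: matches {', '.join(hits)}")
    return refs[hits[0]] if hits else None


def get_sha1(refs, objects, name,
             prefixes=("refs/heads", "refs/tags", "refs/remotes")):
    # AmbiguousRef propagates, so the abbreviation lookup is never
    # reached for an ambiguous name.
    sha1 = resolve_ref(refs, name, prefixes)
    if sha1 is not None:
        return sha1
    matches = [obj for obj in objects if obj.startswith(name)]
    return matches[0] if len(matches) == 1 else None
```

Only a genuine "no such ref" reaches the abbreviated object name
lookup; an ambiguous name is reported as an error instead.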

