Shawn O. Pearce wrote:
Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
On Sat, 27 Mar 2010, Scott Chacon wrote:
My stance has always been that the C Git is authoritative with regards to
formats and protocols. It's up to Github to fix their screw-up.
It is fixed and will be deployed soon, but really, there is no reason
to be snippy. It is a simple and minor mistake affecting very few
repositories (maybe 100 out of 730k)
What is the C Git stance on these 100 repositories then? Are they
now considered corrupt? Or is 100 enough in the wild that we have
to accept the problem, just like we accept the 100664 mode issue from
"ancient" Linux?
I would love to say "those are corrupt, sorry, fix your repository".
But we have traditionally tried to help our users, and not cause
them pain. Forcing a rewrite on these 100 projects to fix up the
corruption is going to be painful for them.
, and the only reason it's an
issue at all is that JGit is not following the authoritative CGit
implementation of basically ignoring it.
But again CGit's fsck is not ignoring this discrepancy. And if the CGit
core is otherwise silently accepting it then it is a mistake.
Right. I tend to agree. CGit was too lax here, fsck shouldn't
be issuing a warning, it should be a fatal error. Both CGit and
JGit are too lax by not failing when reading that tree during
normal processing.
CGit should treat the object as corrupt, output a message to that
effect, and continue checking the rest of the objects. Everything else
that traverses the graph should exit with an error as soon as it
detects a corrupt object.
This would allow someone to use git-for-each-ref and git-rev-list to
prune the graph by deleting refs without trashing the entire repository.
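The pruning workflow described above could be sketched roughly as follows. This is only an illustration, assuming traversal commands exit non-zero once they hit a corrupt object, which is the behaviour being proposed here rather than something every Git version guarantees. The helper names `git` and `untraversable_refs` are hypothetical; only `git for-each-ref` and `git rev-list` are real commands.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return (exit_code, stdout)."""
    proc = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def untraversable_refs(repo, run=git):
    """List refs whose history cannot be walked without an error.

    `run` is injectable so the logic can be exercised without a real
    repository; by default it shells out to git.
    """
    code, out = run(repo, "for-each-ref", "--format=%(refname)")
    if code != 0:
        raise RuntimeError("git for-each-ref failed")

    bad = []
    for ref in out.splitlines():
        # Walk every commit, tree and blob reachable from this ref.
        code, _ = run(repo, "rev-list", "--objects", ref)
        if code != 0:
            bad.append(ref)
    return bad
```

The refs returned are candidates for deletion (or for pointing at an older, intact commit); nothing is deleted by the sketch itself.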
Also, if we're all concerned about "Git reimplementation du jour"
deviations, then we need to focus on libifying Git so there isn't a
need for such re-implementations. I'm hoping to help with a possible
GSoC project on libgit2, but the lack of a linkable library will
ensure that re-implementations in nearly every useful language will
continue.
Don't get me wrong. I'm not against Git reimplementations per se, as
long as they rigorously implement the exact format and protocol from
CGit. In that sense it is important that the CGit fsck and verify-pack
tools be exploited on objects/packs produced by alternate Git
implementations systematically to find such issues.
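One way an alternate implementation could apply this systematically is to run CGit's fsck over every repository its test suite writes. A minimal sketch, assuming `git` is on the PATH; the function names are hypothetical, and whether a given finding is reported as a warning or a fatal error depends on the Git version, so the captured output should be inspected as well as the exit code.

```python
import subprocess


def fsck_command(repo_path):
    """Build the fsck invocation for a repository path."""
    return ["git", "-C", repo_path, "fsck", "--strict", "--full"]


def check_repository(repo_path):
    """Run fsck and return (passed, combined_output).

    `passed` only reflects the exit code; warnings still need to be
    read from the output.
    """
    proc = subprocess.run(
        fsck_command(repo_path),
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0, proc.stdout + proc.stderr
```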
When JGit had the tree sort order wrong, JGit was in the wrong,
and any repository which contained those corrupt trees had to be
fixed by rewriting them. IIRC it was only the JGit repository
itself that had this problem in the wild. But we fixed our code.
IMHO, this leading '0' thing is a similar breakage. We shouldn't
relax CGit or JGit to accept it just because the Ruby implementation
of Git got the tree encoding wrong. If anything, we should teach
these implementations to catch these sorts of problems earlier.
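For reference, catching the leading '0' early is cheap. Each tree entry is the mode in ASCII octal, a space, the name, a NUL, and then the 20-byte binary SHA-1, and a canonical mode never starts with '0' (a subtree is "40000", not "040000"). The following is a minimal sketch of such a check over the raw, already-decompressed tree body with the object header removed; `find_zero_padded_modes` is a hypothetical name, not a function from any Git implementation.

```python
def find_zero_padded_modes(tree_body: bytes):
    """Return (mode, name) for every entry whose mode starts with '0'."""
    bad = []
    pos = 0
    while pos < len(tree_body):
        space = tree_body.index(b" ", pos)       # end of the mode field
        nul = tree_body.index(b"\0", space + 1)  # end of the entry name
        mode = tree_body[pos:space]
        name = tree_body[space + 1:nul]
        if mode.startswith(b"0"):
            bad.append((mode.decode("ascii"), name.decode("utf-8", "replace")))
        pos = nul + 1 + 20  # skip the 20-byte binary SHA-1
    return bad
```

A writer could run the same test on the mode string just before serializing an entry and refuse to emit the tree at all.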
I agree. Now how can the git community help them help themselves?