As for compatibility between JGit and Git:

We (the Apache maven-scm team, with Shawn supporting us; thanks again for patiently answering my sometimes stupid questions) are currently working on a JGit SCM provider for Maven. The command-line git provider has worked pretty well for more than a year now, and once we have the JGit version too, all of this gets tested automatically via our TCK suite. The TCK suite is pretty high-level, but at least all the fundamental stuff is then guaranteed to work for both implementations.

One step on our road is to further 'abstract' the current jgit-core library and introduce a SimpleRepository which basically contains the most important git commands as Java calls (e.g. addRemote, fetch, ...) [1].

So after having this it should be really easy to compare side by side the .git/* of e.g. git-clone uri vs SimpleRepository.clone(uri).

LieGrue,
strub

[1] http://github.com/sonatype/JGit/ branch struberg

--- Shawn O. Pearce <spearce@xxxxxxxxxxx> wrote on Sat, 2 May 2009:

> From: Shawn O. Pearce <spearce@xxxxxxxxxxx>
> Subject: Re: Compatibility between git.git and jgit
> To: "Nicolas Pitre" <nico@xxxxxxx>
> CC: "Junio C Hamano" <gitster@xxxxxxxxx>, git@xxxxxxxxxxxxxxx
> Date: Saturday, 2 May 2009, 3:59
>
> Nicolas Pitre <nico@xxxxxxx> wrote:
> > On Fri, 1 May 2009, Shawn O. Pearce wrote:
> > >
> > > On an unrelated note, someone asked me recently, how do we ensure
> > > compatibility in implementations between git.git and jgit?
> >
> > Well... this is not exactly easy.  As I said in the past
> > (http://marc.info/?l=git&m=121035043412788&w=2), I think that the C
> > version must remain the reference with regards to protocols and on-disk
> > data structures.
>
> I agree fully.
>
> > If people go wild with JGit and start making changes
> > to data structures then it simply won't be Git compatible anymore and
> > the user base will get fragmented.
>
> Agree.
> We may see some prototyping happen in JGit first on some
> topics, and JGit may even support something earlier than git.git,
> e.g. JGit has an amazon-s3:// transport that git.git doesn't have.
> But it also isn't widely used.
>
> > A formal compatibility test suite would imply that every Git
> > reimplementation should be compatible with the reference C version.
> > You could add some tests in your test suite which are performed in
> > parallel using JGit and the C git, and make sure that the produced
> > results are identical, etc.
>
> Yea, and to some extent we try to do that already in JGit, but our
> tests aren't complete enough in that area.
>
> > But to which extent should the C version remain backward compatible with
> > other implementations?  Let's suppose a future protocol extension is
> > made and old unsuspecting C clients work just fine but some other
> > implementation crashes with it?
>
> This is what I think scares both myself and the folks that have
> recently asked me about compatibility.
>
> If JGit gets a broader user base, and suddenly it stops working
> against a newer C git-daemon because of a protocol change, those
> users are going to be pissed.  It's no worse than the "github can't
> ever upgrade past 1.6.1" issue we had not too long ago.
>
> I think we're doing better these days about embedding file format
> version numbers into files (e.g. pack idx v2) to help alert older
> clients that the format is different.  But we also have something
> of a history of looking for "holes" in older C git parsers in
> order to wedge in new features where we didn't plan for them in
> the first place.  E.g. the protocol capability slots we have now.
>
> I think that as reimplementations become more popular, we need to
> rely less on extending things by exploiting parser quirks in older
> C git.git code, and rely more on at least explicit version markers
> that everyone can work with.
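To make the "explicit version marker" point concrete, here is a minimal sketch of how a reader of any implementation might detect the pack index format before parsing it. It relies on the documented pack idx v2 header (a 4-byte magic "\377tOc" followed by a 4-byte big-endian version number), while v1 files carry no header and start directly with the fan-out table. The function names and the set of supported versions are assumptions made up for illustration; this is not code from git.git or JGit.

```python
import struct

# Magic bytes that open a pack index file of version 2 or later.
IDX_V2_MAGIC = b"\xfftOc"


def pack_idx_version(header: bytes) -> int:
    """Return the pack index version implied by the first 8 bytes of the file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of the index file")
    if header[:4] != IDX_V2_MAGIC:
        # Version 1 has no header: the file starts with the fan-out table.
        return 1
    # The version number follows the magic as a 4-byte big-endian integer.
    (version,) = struct.unpack(">I", header[4:8])
    return version


def check_supported(header: bytes, supported=(1, 2)) -> int:
    """Hypothetical helper: fail loudly on an index version we cannot read."""
    version = pack_idx_version(header)
    if version not in supported:
        raise ValueError(f"unsupported pack index version {version}")
    return version
```

A reader written this way rejects a future format with a clear error instead of misparsing it, which is the behaviour an explicit marker is meant to buy for every implementation.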
>
> > And the reference implementation cannot be held back because
> > of bugs in all alternative implementations.
>
> I agree.  A bug is a bug.  But I'd really like to get away from the
> trend where we exploit bugs in older C git.git implementations to
> add new functionality, because maybe JGit doesn't have that same
> bug and will fall flat on its face with that exploit.
>
> > As long as they're futzing^Wdeveloping on top of JGit then
> > interoperability shouldn't be at risk.  If people would start adding new
> > object types and pack formats and the like without obtaining a consensus
> > with people around the C version then I might get extremely worried (and
> > pissed) though.
>
> That's why JGit is BSD, so everyone can use the one f'king library
> and not risk fragmenting the Java market further.
>
> But yea, I'd be really pissed too if someone hacked up JGit and made
> it incompatible with anything else.  It's a risk that the liberal
> BSD license permits.
>
> I'm really sort of hoping that the development momentum around
> git.git and JGit trying to keep up will keep them coming back
> to the canonical JGit for updates, forcing them to give back any
> hacks^Wimprovements they have made.  If the improvements really are
> worthwhile, they can be easily ported over to C before they become
> widely used in JGit.
>
> > One defensive approach we could adopt is to use a capability slot to
> > identify the software version of each peer involved in the network
> > communication.  The advantage would be for a later Git version to avoid
> > doing some things that are known to break with client X or Y.  Of course
> > even such a scheme can be abused and misused, like on some web sites if
> > you don't have the "right" browser, leading some of them to allow faking
> > the User-Agent string, etc.  But maybe the upsides are more important
> > than the downsides.
> > This doesn't help with on-disk interoperability,
> > but this is probably less important than communication interoperability.
>
> Blargh.  I'm with you about the whole User-Agent mess.
>
> Asking clients and servers to identify with implementation and
> version markers might be useful for analysis of who-is-using-what,
> but I don't think it's a good way to negotiate between the peers of
> what functionality to enable or disable, or what bug workarounds
> to use.  Reminds me of the Apache hack during output to work around
> an HTTP header parsing bug in Netscape 2 when the "\r\n" pair was
> exactly at byte 256 in the stream.  *shudder*
>
>
> FWIW, an EGit user recently complained that some random Git hosting
> site they were using couldn't work with EGit, but EGit worked fine
> with other sites, e.g. GitHub.  Apparently this site's SSH forced command
> filter script didn't like EGit asking for "git upload-pack 'path.git'".
>
> It's not strictly a Git protocol issue, how the client launches
> the remote process over SSH, but this random hosting site was
> apparently relying on C git's current calling convention of
> "git-upload-pack 'path.git'".
>
> Long story short, I claimed it was the hosting site's bug.  :-)
>
> --
> Shawn.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
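The SSH calling-convention mismatch Shawn describes ("git upload-pack 'path.git'" vs "git-upload-pack 'path.git'") is easy to tolerate on the server side. The following is only a sketch of a forced-command filter that accepts both spellings; it is not the hosting site's script. It assumes the filter receives the requested command as a single string (with OpenSSH that would typically come from the SSH_ORIGINAL_COMMAND environment variable), and the function name and the set of allowed services are hypothetical.

```python
import shlex

# Services this hypothetical filter is willing to run (an assumption).
ALLOWED_SERVICES = {"git-upload-pack", "git-receive-pack", "git-upload-archive"}


def parse_git_ssh_command(command: str):
    """Return (service, path) for either spelling of the command.

    Raises ValueError for anything that is not an allowed git service.
    """
    # shlex.split handles the shell-style quoting around the repository path.
    words = shlex.split(command)
    # Fold the "git <subcommand>" spelling into the dashed "git-<subcommand>" one.
    if len(words) >= 2 and words[0] == "git":
        words = ["git-" + words[1]] + words[2:]
    if len(words) != 2 or words[0] not in ALLOWED_SERVICES:
        raise ValueError(f"refusing command: {command!r}")
    return words[0], words[1]
```

Normalizing to one spelling before the allow-list check keeps the policy in a single place, so the filter does not depend on which client, C git or EGit, happens to be connecting.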