Re: Compatibility between git.git and jgit

As for compatibility between JGIT and GIT:

We (the Apache maven-scm team, with Shawn supporting us; thanks again for patiently answering my sometimes stupid questions) are currently working on a JGIT SCM provider for maven. The command-line git provider has been working pretty well for more than a year now, and once we have the JGIT version too, all of this will get tested automatically via our TCK suite.

The TCK suite is pretty high-level, but at least all the fundamental stuff is then guaranteed to work for both implementations.

One step on our road is to further 'abstract' the current jgit-core library and introduce a SimpleRepository which basically contains the most important git commands as Java calls (e.g. addRemote, fetch, ...) [1]. Once we have this, it should be really easy to compare the .git/* of e.g. git-clone uri vs SimpleRepository.clone(uri) side by side.
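
A rough sketch of what such a side-by-side comparison could look like. It is written in Python only for brevity and is not part of maven-scm or JGit; the names snapshot and compare_trees are made up for illustration. It just hashes every file below two directories and reports what differs:

```python
import hashlib
import os


def snapshot(root):
    """Map each file path (relative to root) to the SHA-1 of its bytes."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                result[rel] = hashlib.sha1(fh.read()).hexdigest()
    return result


def compare_trees(left_root, right_root):
    """Return (only_left, only_right, differing) as sorted lists of relative paths."""
    left, right = snapshot(left_root), snapshot(right_root)
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    differing = sorted(p for p in set(left) & set(right) if left[p] != right[p])
    return only_left, only_right, differing
```

One would call it as compare_trees("c-clone/.git", "jgit-clone/.git"), where both directory names are placeholders. Byte equality is only a first pass: files such as the index carry timestamps, so some differences are expected and would have to be judged by hand.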


LieGrue,
strub

[1] http://github.com/sonatype/JGit/ branch struberg
--- Shawn O. Pearce <spearce@xxxxxxxxxxx> wrote on Sat, 2 May 2009:

> From: Shawn O. Pearce <spearce@xxxxxxxxxxx>
> Subject: Re: Compatibility between git.git and jgit
> To: "Nicolas Pitre" <nico@xxxxxxx>
> CC: "Junio C Hamano" <gitster@xxxxxxxxx>, git@xxxxxxxxxxxxxxx
> Date: Saturday, 2 May 2009, 3:59
> Nicolas Pitre <nico@xxxxxxx> wrote:
> > On Fri, 1 May 2009, Shawn O. Pearce wrote:
> > 
> > > On an unrelated note, someone asked me recently, how do we ensure
> > > compatibility in implementations between git.git and jgit?
> > 
> > Well... this is not exactly easy.  As I said in the past
> > (http://marc.info/?l=git&m=121035043412788&w=2), I think that the C
> > version must remain the reference with regards to protocols and on-disk
> > data structures.
> 
> I agree fully.
> 
> > If people go wild with JGit and start making changes
> > to data structures then it simply won't be Git compatible anymore and
> > the user base will get fragmented.
> 
> Agree.  We may see some prototyping happen in JGit first on some
> topics, and JGit may even support something earlier than git.git,
> e.g. JGit has an amazon-s3:// transport that git.git doesn't have.
> But it also isn't widely used.
> 
> > A formal compatibility test suite would imply that every Git
> > reimplementation should be compatible with the reference C version.
> > You could add some tests in your test suite which are performed in
> > parallel using JGit and the C git, and make sure that the produced
> > results are identical, etc.
> 
> Yea, and to some extent we try to do that already in JGit, but our
> tests aren't complete enough in that area.
>  
> > But to which extent should the C version remain backward compatible with
> > other implementations?  Let's suppose a future protocol extension is
> > made and old unsuspecting C clients work just fine but some other
> > implementation crashes with it?
> 
> This is what I think scares both myself and the folks that have
> recently asked me about compatibility.
> 
> If JGit gets a broader user base, and suddenly it stops working
> against a newer C git-daemon because of a protocol change, those
> users are going to be pissed.  It's no worse than the "github can't
> ever upgrade past 1.6.1" issue we had not too long ago.
> 
> I think we're doing better these days about embedding file format
> version numbers into files (e.g. pack idx v2) to help alert older
> clients that the format is different.  But we also have something
> of a history of looking for "holes" in older C git parsers in
> order to wedge in new features where we didn't plan for them in
> the first place.  E.g. the protocol capability slots we have now.
> 
> I think that as reimplementations become more popular, we need to
> rely less on extending things by exploiting parser quirks in older
> C git.git code, and rely more on at least explicit version markers
> that everyone can work with.
> 
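
To make the "explicit version marker" idea concrete, here is a minimal sketch of how a reader could check the pack index header before parsing anything else. It is neither JGit nor git.git code. The function names are made up for illustration. The only format facts relied on are that a v2 index starts with the 4-byte magic \377tOc followed by a 4-byte big-endian version number, and that v1 has no header at all:

```python
import struct

PACK_IDX_MAGIC = b"\xfftOc"  # present in index files of version 2 and later


def pack_idx_version(header):
    """Return the pack index version announced by the first 8 bytes of a .idx file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of header")
    if header[:4] != PACK_IDX_MAGIC:
        # Version 1 has no header; the file starts directly with the fan-out table.
        return 1
    (version,) = struct.unpack(">I", header[4:8])
    return version


def check_supported(header, supported=(1, 2)):
    """Refuse cleanly instead of misparsing a format this reader does not know."""
    version = pack_idx_version(header)
    if version not in supported:
        raise ValueError("unsupported pack index version %d" % version)
    return version
```

The point of the second function is that an older reader meeting a newer index fails with a clear message instead of misreading the data.
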
> > And the reference implementation cannot be held back because
> > of bugs in all alternative implementations.
> 
> I agree.  A bug is a bug.  But I'd really like to get away from the
> trend where we exploit bugs in older C git.git implementations to
> add new functionality, because maybe JGit doesn't have that same
> bug and will fall flat on its face with that exploit.
> 
> > As long as they're futzing^Wdeveloping on top of Jgit then
> > interoperability shouldn't be at risk.  If people would start adding new
> > object types and pack formats and the like without obtaining a consensus
> > with people around the C version then I might get extremely worried (and
> > pissed) though.
> 
> That's why JGit is BSD, so everyone can use the one f'king library
> and not risk fragmenting the Java market further.
> 
> But yea, I'd be really pissed too if someone hacked up JGit and made
> it incompatible with anything else.  It's a risk that the liberal
> BSD license permits.
> 
> I'm really sort of hoping that the development momentum around
> git.git and JGit trying to keep up will keep them coming back
> to the canonical JGit for updates, forcing them to give back any
> hacks^Wimprovements they have made.  If the improvements really are
> worthwhile, they can be easily ported over to C before they become
> widely used in JGit.
>  
> > One defensive approach we could adopt is to use a capability slot to
> > identify the software version of each peer involved in the network
> > communication.  The advantage would be for a later Git version to avoid
> > doing some things that are known to break with client X or Y.  Of course
> > even such a scheme can be abused and misused, like on some web sites if
> > you don't have the "right" browser, leading some of them to allow faking
> > the User-Agent string, etc.  But maybe the upsides are more important
> > than the downsides.  This doesn't help with on-disk interoperability,
> > but this is probably less important than communication interoperability.
> 
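
For readers who have not looked at the wire format: the capability list travels after a NUL byte on the first advertised ref line, so an identification slot would just be one more token there. The sketch below only illustrates that idea and is not code from either implementation. The "agent=" token name and its value are assumptions made for this example, and both function names are invented:

```python
def parse_first_advertisement(pkt_line):
    """Split the first advertised ref line into (object_id, refname, capabilities)."""
    length = int(pkt_line[:4], 16)  # the length counts the 4 hex digits as well
    payload = pkt_line[4:length].rstrip(b"\n")
    ref_part, _, cap_part = payload.partition(b"\0")
    object_id, refname = ref_part.split(b" ", 1)
    capabilities = cap_part.decode("ascii").split()
    return object_id.decode("ascii"), refname.decode("ascii"), capabilities


def peer_identity(capabilities, key="agent"):
    """Return the value of a 'key=value' capability, or None if the peer sent none."""
    prefix = key + "="
    for cap in capabilities:
        if cap.startswith(prefix):
            return cap[len(prefix):]
    return None
```

A peer that does not know the token simply ignores it, which is what makes this kind of slot usable for statistics without forcing anyone to act on it.
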
> Blargh.  I'm with you about the whole User-Agent mess.
> 
> Asking clients and servers to identify with implementation and
> version markers might be useful for analysis of who-is-using-what,
> but I don't think it's a good way to negotiate between the peers of
> what functionality to enable or disable, or what bug workarounds
> to use.  Reminds me of the Apache hack during output to work around
> an HTTP header parsing bug in Netscape 2 when the "\r\n" pair was
> exactly at byte 256 in the stream.  *shudder*
> 
> 
> FWIW, an EGit user recently complained that some random Git hosting
> site they were using couldn't work with EGit, but EGit worked fine
> with other sites, e.g. GitHub.  Apparently this site's SSH forced command
> filter script didn't like EGit asking for "git upload-pack 'path.git'".
> 
> It's not strictly a Git protocol issue, how the client launches
> the remote process over SSH, but this random hosting site was
> apparently relying on C git's current calling convention of
> "git-upload-pack 'path.git'".
> 
> Long story short, I claimed it was the hosting site's bug.  :-)
> 
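
A forced-command filter that accepts both spellings is not hard to write. The following is only a sketch of the idea, not the script of any real hosting site. The function name and the set of allowed services are assumptions; the command string would typically come from the SSH_ORIGINAL_COMMAND environment variable:

```python
import shlex

ALLOWED_SERVICES = {"git-upload-pack", "git-receive-pack"}


def parse_git_ssh_command(command):
    """Return (service, path) for either spelling of the request, or raise ValueError."""
    words = shlex.split(command)
    # Fold the two-word spelling "git upload-pack" into the dashed one.
    if len(words) == 3 and words[0] == "git":
        words = ["git-" + words[1], words[2]]
    if len(words) != 2 or words[0] not in ALLOWED_SERVICES:
        raise ValueError("not a recognised git service request: %r" % command)
    return words[0], words[1]
```

Both "git-upload-pack 'path.git'" and "git upload-pack 'path.git'" then come out as the same (service, path) pair, and anything else is rejected.
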
> -- 
> Shawn.
> --
> To unsubscribe from this list: send the line "unsubscribe
> git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


      