Submodule, subtree, or something else?

Hello,


First of all, I apologise for the wall of text that follows; obviously I
am bad at this.

My $DAYJOB is switching from Subversion to Git, primarily because of
its distributed nature (we are scattered all across the globe), and the
ease of branching and merging.  One issue that has popped up is how to
manage code shared between multiple projects.

Our SVN setup used a shared repository for all projects, either using
externals for shared code, or, more often than not, simply merging the
code between projects as needed.  Ignoring the fact that merging with
SVN is somewhat cumbersome, overall it has worked quite well for us,
especially when combined with git-svn.

For external libraries that rarely change, submodules appear to be the
obvious choice when using Git.  On the other hand, I've found them
somewhat cumbersome to use, and subtree merging (either using git
subtree, or directly with git merge -s subtree) is closer to what we
were doing in SVN.  A major drawback of submodules in my opinion is the
inability to make a full clone from an existing one without having
access to the central repository, which is something I have to do from
time to time.
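
To make the submodule case concrete, here is a minimal sketch using
throwaway repositories under a temporary directory (the names lib and
main are made up for illustration).  It shows that the superproject
commit stores only the library's URL in .gitmodules plus a pointer to one
library commit.  The protocol.file.allow override is there only because
recent Git releases refuse local-path submodule clones by default.

```shell
set -eu
# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

# Library repository with a single commit (N).
git init -q "$work/lib"
echo 'v1' > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "N: add library"

# Main project with a single commit of its own (A).
git init -q "$work/main"
echo 'app' > "$work/main/main.txt"
git -C "$work/main" add main.txt
git -C "$work/main" commit -q -m "A: add main project"

# Add the library as a submodule and record it in the main project (S1).
git -C "$work/main" -c protocol.file.allow=always \
    submodule add "$work/lib" lib >/dev/null 2>&1
git -C "$work/main" commit -q -m "S1: add lib submodule"

# What the superproject actually stores: a URL and a commit pointer.
git -C "$work/main" config -f .gitmodules submodule.lib.url
git -C "$work/main" ls-tree HEAD lib
```

The ls-tree entry has mode 160000 (a gitlink), so the main project's own
history contains no library commits, only references to them.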

For internal libraries, the situation is even less clear.  For many of
these libraries, most of the development happens within the context of a
single project, with commits to the main project being interleaved with
commits to the subproject(s), resulting in histories resembling:

 (using git submodule)

   A---B---S1---S2---C---S3
          ,´   ,´       ,´
     N---O----P----Q---R

 (using git subtree with --rejoin)

   A---B---N---O---M1---M2---Q---C---R---M3
                  /    /                /
             N'--O'---P--------Q'------R'

 (using merge -s subtree)

   A---B---M1---M2---C---M3
          /    /        /
     N---O----P----Q---R

where A, B and C are changes to the main project, N, O, P, Q and R are
changes to library code, and Sn and Mn are submodule updates and merge
commits, respectively.
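
For reference, the merge -s subtree history could be produced roughly as
follows.  This is a minimal sketch with throwaway repositories; the
names lib and main, the lib/ prefix and the commit messages are
assumptions made for the example, not our actual layout.

```shell
set -eu
# Throwaway sandbox; every name below is hypothetical.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

# Library repository: commit N.
git init -q "$work/lib"
echo 'v1' > "$work/lib/lib.txt"
git -C "$work/lib" add lib.txt
git -C "$work/lib" commit -q -m "N: add library"

# Main project: commit A.
git init -q "$work/main"
cd "$work/main"
echo 'app' > main.txt
git add main.txt
git commit -q -m "A: add main project"

# M1: first import, placing the library tree under lib/.
git fetch -q "$work/lib" HEAD
git merge -q -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
git read-tree --prefix=lib/ -u FETCH_HEAD
git commit -q -m "M1: merge library into lib/"

# The library moves on: commit O.
echo 'v2' > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -a -m "O: update library"

# M2: later updates are merged with the subtree strategy, which works
# out by itself that the library tree lives under lib/.
git fetch -q "$work/lib" HEAD
git merge -q -s subtree -m "M2: merge library update" FETCH_HEAD

git --no-pager log --oneline --graph
```

Note that the library commits here are made in the library repository
and then merged; changes made under lib/ in the main project do not flow
back without extra work.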

From what I have gathered, submodules have issues with branching and
merging; therefore, unless I'm mistaken, submodules are more or less out
of the question.  Of the remaining two options, merging directly results
in a nicer history, but requires making all changes in the library repo
first (although I am quite sure that a similar effect can be achieved
with plumbing, much like how git subtree split works), and is harder to
use than git subtree.  Also, all three options can result in the main
project history being cluttered with extra commits.

Lastly, there is a particularly painful 3rd-party library that has an
enormous amount of local modifications that are never going to make it
upstream (essentially making it a fork), project-specific changes that
are required for one project but would break others, and separate
language bindings that access the internals (often requiring bug fixes
to be made to both simultaneously).  If that wasn't enough, it *requires*
several source files to be modified for each individual project that
uses it.  It's a complete mess, but we're stuck with it for the existing
projects, as switching to an alternative would be too time-consuming.


To sum up, I'm looking for something that would let us share code
between multiple projects, and would allow for:

1) separate histories with relatively easy branching and merging

2) a distributed workflow without having to set up multiple repositories
everywhere (e.g. work <-> home <-> laptop)

3) working on the shared code within a project that uses it

4) inspection of the complete history

5) modifications that are not shared with other projects

and would not result in lots of clutter in the history.

Repository size is somewhat less of an issue, because each submodule has
to be checked out anyway.

Submodules let you have #3, and #1, #2 and #5 to a point, after which it
becomes a pain.  git subtree allows #1, #2, #3 and #4, and #5 with some
pain (?), but results in duplicate commits.  Using subtree merge
strategy directly gives everything except #3, but is harder to use than
submodules or subtree.

Are there any other options besides these three for sharing (or in some
cases, not sharing) common code between projects using Git, that would
address the above points better?  Or, alternatively, ways to work around
the drawbacks of the existing tools?

Lastly, I will be grateful for any suggestions about how to handle the
messy case described above better.

Thanks,
Jānis
