Re: Submodule, subtree, or something else?

On Fri, Aug 21, 2015 at 3:47 PM, Jānis Rukšāns <janis.ruksans@xxxxxxxxx> wrote:
> Hello,
>
>
> First of all, I apologise for the wall of text that follows; obviously I
> am bad at this.
>
> My $DAYJOB is switching from Subversion to Git, primarily because of
> its distributed nature (we are scattered all across the globe), and the
> ease of branching and merging.  One issue that has popped up is how to
> manage code shared between multiple projects.
>
> Our SVN setup used a shared repository for all projects, either using
> externals for shared code, or, more often than not, simply merging the
> code between projects as needed.  Ignoring the fact that merging with
> SVN is somewhat cumbersome, overall it has worked quite well for us,
> especially when combined with git-svn.
>
> For external libraries that rarely change, submodules appear to be the
> obvious choice when using Git.  On the other hand, I've found them
> somewhat cumbersome to use, and subtree merging (either using git
> subtree, or directly with git merge -s subtree) is closer to what we
> were doing in SVN.  A major drawback of submodules in my opinion is the
> inability to make a full clone from an existing one without having
> access to the central repository, which is something I have to do from
> time to time.

Can you elaborate on that a bit more?
As far as I can tell, git clone --recurse-submodules should do that no
matter which remote you contact.
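
To make the question concrete, here is a minimal sketch of the case I have
in mind. It is an assumption-laden toy, not a statement about your setup:
the sandbox directory, the repository names (lib, main, copy) and the g()
wrapper are all hypothetical, and protocol.file.allow is only passed
because recent Git versions refuse local-path submodule clones by default.
Note that the clone of the superproject comes from an existing repository,
while the submodule is still fetched from whatever URL .gitmodules records.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Pass identity and file-transport settings per command, so no global
# configuration is touched.
g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always "$@"
}

# A stand-in for the shared library.
g init -q lib
( cd lib && echo 'shared code' > lib.txt && g add lib.txt && g commit -q -m 'library commit' )

# A stand-in for the main project, with the library added as a submodule.
g init -q main
( cd main && g submodule --quiet add "$work/lib" lib && g commit -q -m 'add lib as submodule' )

# Clone the main project from the existing repository, submodules included.
g clone -q --recurse-submodules "$work/main" copy

# The submodule's working tree is populated in the new clone.
cat copy/lib/lib.txt
```

If the URL recorded in .gitmodules is unreachable from where you clone,
this last step is presumably where things break for you.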


>
> For internal libraries, the situation is even less clear.  For many of
> these libraries, most of the development happens within the context of a
> single project, with commits to the main project being interleaved with
> commits to the subproject(s), resulting in histories resembling:
>
>  (using git submodule)
>
>    A---B---S1---S2---C---S3
>           ,´   ,´       ,´
>      N---O----P----Q---R
>
>  (using git subtree with --rejoin)
>
>    A---B---N---O---M1---M2---Q---C---R---M3
>                   /    /                /
>              N'--O'---P--------Q'------R'
>
>  (using merge -s subtree)
>
>    A---B---M1---M2---C---M3
>           /    /        /
>      N---O----P----Q---R
>
> where A, B and C are changes to the main project, N, O, P, Q and R are
> changes to library code, and Sn and Mn are submodule updates and merge
> commits, respectively.
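
For reference, the third shape (merge -s subtree) can be reproduced with
core commands only. The following is a rough sketch under stated
assumptions: the sandbox, the repository and file names and the g()
wrapper are hypothetical, the library lives under lib/, and the first
merge follows the read-tree recipe from Git's subtree-merge howto. It
builds A, B, M1 and M2 from the diagram, with N, O and P on the library
side.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; repository names, file names and messages are hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# Library repository with commits N and O.
g init -q lib
( cd lib
  echo n > n.txt && g add n.txt && g commit -q -m N
  echo o > o.txt && g add o.txt && g commit -q -m O )

# Main project with commits A and B.
g init -q main
cd main
echo a > a.txt && g add a.txt && g commit -q -m A
echo b > b.txt && g add b.txt && g commit -q -m B

# M1: record a merge with the library history, then place its tree under lib/.
g fetch -q ../lib HEAD
g merge -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
g read-tree --prefix=lib/ -u FETCH_HEAD
g commit -q -m M1

# The library gains commit P; M2 merges it into the lib/ subdirectory.
( cd ../lib && echo p > p.txt && g add p.txt && g commit -q -m P )
g fetch -q ../lib HEAD
g merge -q -s subtree -m M2 FETCH_HEAD

# Show the resulting history shape.
g log --graph --oneline
```

Changes made under lib/ in the main project are not sent back to the
library by any of these commands, which matches the drawback described
for this option.
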
>
> From what I have gathered, submodules have issues with branching and
> merging; therefore, unless I'm mistaken, submodules are kinda out of
> the question.  Of the remaining two options, merging directly results in a
> nicer history, but requires making all changes to the library repo first
> (although I am quite sure that a similar effect can be achieved with
> plumbing, similarly to how git subtree split works), and is harder to
> use than git subtree.  Also, all three options can result in the main
> project history being cluttered with extra commits.
>
> Lastly, there is a particularly painful 3rd party library that has an
> enormous amount of local modifications that are never going to make it
> upstream (essentially making it a fork), project-specific changes that
> are required for one project but would break others, separate language
> bindings that access the internals (often requiring bug fixes to be made
> simultaneously to both), and, if that wasn't enough, it *requires*
> several source files to be modified for each individual project that
> uses it.  It's a complete mess, but we're stuck with it for the existing
> projects, as switching to an alternative would be too time-consuming.
>
>
> To sum up, I'm looking for something that would let us share code
> between multiple projects and allow for:
>
> 1) separate histories with relatively easy branching and merging
>
> 2) a distributed workflow without having to set up multiple repositories
> everywhere (e.g. work <-> home <-> laptop)
>
> 3) working on the shared code within a project that uses it
>
> 4) inspection of the complete history
>
> 5) modifications that are not shared with other projects
>
> and would not result in lots of clutter in the history.
>
> Repository size is somewhat less of an issue, because each submodule has
> to be checked out anyway.
>
> Submodules let you have #3, and #1, #2 and #5 to a point, after which it
> becomes a pain.  git subtree allows #1, #2, #3 and #4, and #5 with some
> pain (?), but results in duplicate commits.  Using subtree merge
> strategy directly gives everything except #3, but is harder to use than
> submodules or subtree.
>
> Are there any other options beside these three for sharing (or in some
> cases, not sharing) common code between projects using Git, that would
> address the above points better?  Or, alternatively, ways to work around
> the drawbacks of the existing tools?
>
> Lastly, I will be grateful for any suggestions about how to handle the
> messy case described above better.
>
> Thanks,
> Jānis
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html