Re: submodules' shortcomings, was Re: RFC: display dirty submodule working directory in git gui and gitk

Hello,

Let me pop in here to support Johannes: I agree with every single point
he enumerated. Every. Single. Point.

For instance, I'd like to have a 'cmake' repository where I store all
the FindBlah.cmake modules, so that I can share them from every
repository, and not worry about users changing and committing in the
main project instead of the submodule. I can't. Subversion externals
still rule in that regard.

On Mon, Jan 4, 2010 at 11:29 PM, Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
> Hi,
>
> On Mon, 4 Jan 2010, Jens Lehmann wrote:
>
>> Am 04.01.2010 10:44, schrieb Johannes Schindelin:
>> > The real problem is that submodules in the current form are not very
>> > well designed.
>>
>> IMVHO using the tree sha1 for a submodule seems to be the 'natural' way
>> to include another git repo. And it gives the reproducibility I expect
>> from an SCM. Or am I missing something?
>
> You do remember the discussion at the Alles wird Git about the need for
> Subversion external-like behavior, right?
>
>> It looks to me as most shortcomings come from the fact that most git
>> commands tend to ignore submodules (and if they don't, like git gui and
>> gitk do now, they e.g. only show certain aspects of their state).
>
> It is not only ignoring.  It is not being able to cope with the state only
> submodules can be in (see below).
>
>> Submodules are in heavy use in our company since last year. Virtually
>> every patch I submitted for submodules came from that experience and
>> scratched an itch I or one of my colleagues had (and the situation did
>> already improve noticeably by the few things we changed). We are still
>> convinced that using submodules was the right decision. But some work
>> still has to be done to be able to use them easily and to get rid of
>> some pitfalls.
>
> Submodules may be the best way you have in Git for your workflow ATM.
> But that does not mean that the submodule design is in any way
> thought-through.
>
> Just a few shortcomings that do show up in my main project (and to a
> small extent in msysGit, as you are probably aware):
>
> - submodules were designed with a strong emphasis on not being forced to
>  check them out.  But Git makes it very inconvenient to actually check
>  submodules out, let alone check them out at clone-time.  And it is
>  outright impossible to _enforce_ that a submodule is checked out.
>
> - among other use cases, submodules are recommended for sharing content
>  between two different repositories. But it is part of the design that it
>  is _very_ easy to forget to commit, or push the changes in the submodule
>  that are required for the integrity of the superproject.
>
> - that use case -- sharing content between different repositories -- is
>  not really supported by submodules, but rather an afterthought.  This is
>  all too obvious when you look at the restriction that the shared content
>  must be in a single subdirectory.
>
> - submodules would be a perfect way to provide a fast-forward-only media
>  subdirectory that is written to by different people (artists) than
>  the superproject (developers).  But there is no mechanism to enforce
>  shallow fetches, which means that this use case cannot be handled
>  efficiently using Git.
>
> - related are the use cases where it is desired not to have a fixed
>  submodule tip committed to the superproject, but always to update to the
>  current, say, master (like Subversion's externals).  This use case has
>  been wished away by the people who implemented submodules in Git.  But
>  reality has this nasty habit of ignoring your wishes, does it not?
>
> - there have been patches supporting rebasing submodules, i.e.
>  submodules where a "git submodule update" rebases the current branch to
>  the revision committed to the superproject rather than detaching the
>  HEAD, which everybody who ever contributed to a project with submodules
>  should agree is a useful thing. But the patches have only been discussed
>  to death, to the point where the discussion's information content was
>  converging to zero, yet the patches did not make it into Git.  (FWIW
>  this is one reason why I refuse to write patches to git-submodule.sh: I
>  refuse to let my time be wasted like that.)
>
> - working directories with GIT_DIRs are a very different beast from single
>  files.  That alone leads to a _lot_ of problems.  The original design of
>  Git had only a couple of states for named content (AKA files): clean,
>  added, removed, modified.  The states that are possible with submodules
>  are for the most part not handled _at all_ by most Git commands (and it
>  is sometimes very hard to decide what would be the best way to handle
>  those states).  Just think of a submodule at a different
>  revision than committed in the superproject, with uncommitted changes,
>  ignored and unignored files, a few custom hooks, a bit of additional
>  metadata in the .git/config, and just for fun, a few temporary files in
>  .git/ which are used by the hooks.
>
> - while it might be called clever that the submodules' metadata are stored
>  in .gitmodules in the superproject (and are therefore naturally tracked
>  with Git), the synchronization with .git/config is performed exactly
>  once -- when you initialize the submodule.  You are likely to miss out
>  on _every_ change you pulled into the superproject.
>
> All in all, submodules are very clumsy to work with, and you are literally
> forced to provide scripts in the superproject to actually work with the
> submodules.
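
To make the .gitmodules point above concrete, here is a minimal sketch in
Python. It assumes only that both files use the usual [submodule "name"]
sections with a url key; the function names are hypothetical, and the parser
is deliberately simplified, so it is not how Git itself reads its
configuration. It reports submodules whose URL in the tracked .gitmodules no
longer matches the copy in .git/config:

```python
import re

# Matches section headers such as: [submodule "cmake"]
SECTION_RE = re.compile(r'^\[submodule\s+"(?P<name>[^"]+)"\]$')


def submodule_urls(config_text):
    """Return {submodule name: url} from git-config style text.

    Simplified for illustration: quoting, escapes, includes and line
    continuations that real git config files may use are ignored.
    """
    urls = {}
    current = None  # name of the submodule section being read, if any
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("["):
            match = SECTION_RE.match(line)
            current = match.group("name") if match else None
            continue
        if current is None or "=" not in line:
            continue  # key outside a submodule section, or a bare key
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "url":
            urls[current] = value
    return urls


def stale_submodules(gitmodules_text, local_config_text):
    """List (name, tracked_url, local_url) where the two files disagree.

    Submodules absent from the local config (never initialized) are
    skipped, since there is no local copy to be out of date.
    """
    tracked = submodule_urls(gitmodules_text)
    local = submodule_urls(local_config_text)
    return [
        (name, tracked[name], local[name])
        for name in sorted(tracked)
        if name in local and tracked[name] != local[name]
    ]
```

Feeding it the text of .gitmodules and .git/config returns one tuple per
out-of-date submodule; an empty list means the local copies still match. The
sketch only reports the drift. Copying the tracked URLs over remains a manual
step ("git submodule sync").
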
>
>> > In the short run, we can paper over the shortcomings of the submodules
>> > by introducing a command line option "--include-submodules" to
>> > update-refresh, diff-files and diff-index, though.
>>
>> Maybe this is the way to go for now (and hopefully we can turn this
>> option on by default later because we did the right thing ;-).
>
> I do not think that --include-submodules is a good default.  It is just
> too expensive in terms of I/O even to check the status in a superproject
> with a lot of submodules.
>
> Besides, as long as there is enough reason to have out-of-Git alternative
> solutions such as repo, submodules deserve to be 2nd-class citizens.
>
> Ciao,
> Dscho
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Pau Garcia i Quiles
http://www.elpauer.org
(Due to my workload, I may need 10 days to answer)