On Mon, Apr 30, 2018 at 5:38 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Mon, Apr 30, 2018 at 1:45 PM, Avery Pennarun <apenwarr@xxxxxxxxx> wrote:
> No objections from me either.
>
> Submodules seem to serve a slightly different purpose, though?

I think the purpose is actually the same - it's just the tradeoffs
that are different.  Both sets of tradeoffs kind of suck.

> With Subtrees the superproject always contains all the code,
> even when you squash the subtree history when merging it in.
> In the submodule world, you may not have access to one of the
> submodules.

Right.  Personally I think it's a disadvantage of subtree that it
always contains all the code (what if some people don't want the code
for a particular build variant?).  However, it's a huge pain that
submodules *don't* contain all the code (what if I'm not online right
now, or the site supposedly containing the code goes offline, or I
want to make my own fork?).

For the best of both worlds, I've often thought that a good balance
would be to use the same data structure that submodule uses, but to
store all the code in a single git repo under different refs, which we
might or might not download (or might or might not have different
ACLs) under different circumstances.

However, when some projects get really huge (lots of very big
submodule dependencies), then repacking one-big-repo starts becoming
unwieldy; in that situation git-subtree also fails completely.

> Submodules do not need to produce a synthetic project history
> when splitting off again, as the history is genuine. This allows
> for easier work with upstream.

Splitting for easier work upstream is great, and there really ought to
be an official version of 'git subtree split', which is good for all
sorts of purposes.

However, I suspect almost all uses of the split feature are a)
splitting a subtree that you previously merged in, or b) splitting a
subtree into a separate project that you want to maintain separately
from now on.
Repeated splits in case (a) are only necessary because you're not
using submodules, or in case (b) are only necessary because you didn't
*switch* to submodules when it finally came time to split the
projects.  (In both cases you probably didn't switch to submodules
because you didn't like one of its tradeoffs, especially the need to
track multiple repos when you fork.)

> Subtrees present you the whole history by default and the user
> needs to be explicit about not wanting to see history from the
> subtree, which is the opposite of submodules (though this
> may be planned in the future to switch).

It turns out that AFAIK, almost everyone prefers 'git subtree
--squash', which squashes into a single commit each time you merge,
much like git submodule does.  I doubt people would cry too much if
the full-history feature went away.

There's one exception, which is doing a one-time permanent merge of
two projects into one.  That's a nice feature, but is probably used
extremely rarely.  More often people get into a merge-split-merge-split
cycle that would be better served by a slightly improved git-submodule.

>> The gerrit team (eg. Stefan Beller) has been doing some really great
>> stuff to make submodules more usable by helping with relative
>> submodule links and by auto-updating links in supermodules at the
>> right times.  Unfortunately doing that requires help from the server
>> side, which kind of messes up decentralization and so doesn't solve
>> the problem in the general case.
>
> Conceptually Gerrit is doing
>
>   while true:
>     git submodule update --remote
>     if worktree is dirty:
>         git commit "update the submodules"
>
> just that Gerrit doesn't poll but does it event based.

...and it's super handy :)  The problem is it's fundamentally
centralized: because gerrit can serialize merges into the submodule,
it also knows exactly how to update the link in the supermodule.
If there was wild branching and merging (as there often is in git) and
you had to resolve conflicts between two submodules, I don't think it
would be obvious at all how to do it automatically when pushing a
submodule.  (This also works quite badly with git subtree --squash.)

>> I really wish there were a good answer, but I don't know what it is.
>> I do know that lots of people seem to at least be happy using
>> git-subtree, and would be even happier if it were installed
>> automatically with git.
>
> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>
> Not sure what to make of this data.

Clearly people need a lot more help when using submodules than when
using subtree :)

Have fun,

Avery