On Mon, Apr 30, 2018 at 2:53 PM, Avery Pennarun <apenwarr@xxxxxxxxx> wrote:

> For the best of both worlds, I've often thought that a good balance
> would be to use the same data structure that submodule uses, but to
> store all the code in a single git repo under different refs, which we
> might or might not download (or might or might not have different
> ACLs) under different circumstances.

There has been some experimentation with having a simpler ref surface on
the submodule side,
https://public-inbox.org/git/cover.1512168087.git.jonathantanmy@xxxxxxxxxx/

The way you describe the future of submodules, all we'd have to do is
teach git-clone how to select the "interesting" refs for your use case.
Any other command would assume all submodule data to be in the main
repository.

The difference from Jonathan's proposal linked above would be that the
object store lives in the main repo and the refs are prefixed per
submodule instead of "shadowed".

> However, when some projects get
> really huge (lots of very big submodule dependencies), then repacking
> one-big-repo starts becoming unwieldy; in that situation git-subtree
> also fails completely.

Yes, but that is a general scaling problem of Git that could be tackled,
e.g. repack into multiple packs serially instead of putting everything
into one pack.

>> Submodules do not need to produce a synthetic project history
>> when splitting off again, as the history is genuine. This allows
>> for easier work with upstream.
>
> Splitting for easier work upstream is great, and there really ought to
> be an official version of 'git subtree split', which is good for all
> sorts of purposes.
>
> However, I suspect almost all uses of the split feature are a)
> splitting a subtree that you previously merged in, or b) splitting a
> subtree into a separate project that you want to maintain separately
> from now on.
> Repeated splits in case (a) are only necessary because
> you're not using submodules, or in case (b) are only necessary because
> you didn't *switch* to submodules when it finally came time to split
> the projects. (In both cases you probably didn't switch to submodules
> because you didn't like one of its tradeoffs, especially the need to
> track multiple repos when you fork.)

That makes sense.

>
> There's one exception, which is doing a one-time permanent merge of
> two projects into one. That's a nice feature, but is probably used
> extremely rarely. More often people get into a
> merge-split-merge-split cycle that would be better served by a
> slightly improved git-submodule.

This rare use case is how git-subtree came into existence in Git's
contrib directory AFAICT,
https://kernel.googlesource.com/pub/scm/git/git/+/634392b26275fe5436c0ea131bc89b46476aa4ae
which is interesting to view in git-show, but I think defaults could be
tweaked there, as it currently shows me mostly a license file.

>> Conceptually Gerrit is doing
>>
>>     while true:
>>         git submodule update --remote
>>         if worktree is dirty:
>>             git commit "update the submodules"
>>
>> just that Gerrit doesn't poll but does it event based.
>
> ...and it's super handy :) The problem is it's fundamentally
> centralized: because gerrit can serialize merges into the submodule,
> it also knows exactly how to update the link in the supermodule. If
> there was wild branching and merging (as there often is in git) and
> you had to resolve conflicts between two submodules, I don't think it
> would be obvious at all how to do it automatically when pushing a
> submodule. (This also works quite badly with git subtree --squash.)

With the poll based solution I don't think you'd run into many more
problems than you would with Gerrit's solution.
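
To make the poll based variant concrete, here is a minimal sketch of what
such a loop could look like. This is only an illustration under my own
assumptions, not what Gerrit does internally: the names `run_git`,
`poll_once` and `poll_forever`, the 60 second interval and the commit
message are all hypothetical, and it assumes it is run from the top of
the superproject worktree.

```python
import subprocess
import time
from typing import Callable, List

# A runner takes git arguments and returns git's stdout (hypothetical helper type).
Runner = Callable[[List[str]], str]


def run_git(args: List[str]) -> str:
    """Run git in the current directory; raises CalledProcessError on failure."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def poll_once(git: Runner = run_git, message: str = "update the submodules") -> bool:
    """One iteration of the loop; returns True if a commit was made."""
    # Move each submodule to the tip of its remote-tracking branch.
    git(["submodule", "update", "--remote"])
    # Ignore untracked files and merely dirty submodule worktrees, so the
    # output is non-empty only for tracked changes such as moved gitlinks.
    status = git([
        "status", "--porcelain",
        "--untracked-files=no", "--ignore-submodules=dirty",
    ])
    if not status.strip():
        return False
    # Note: -a also picks up any other modified tracked files in the superproject.
    git(["commit", "-a", "-m", message])
    return True


def poll_forever(interval_seconds: float = 60.0) -> None:
    """Poll instead of reacting to events (the interval is an arbitrary choice)."""
    while True:
        poll_once()
        time.sleep(interval_seconds)
```

Passing the runner in as a parameter is just a convenience so the logic
can be exercised without a real repository; conflict handling is left
out entirely.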

In a nearby thread, we were just discussing the submodule merging
strategies,
https://public-inbox.org/git/1524739599.20251.17.camel@xxxxxxxxxxxxx/
which might seem confusing, but the implementation is actually easy as
we just fast-forward only in submodules.

>>
>> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

That could be true. :)

Thanks,
Stefan