On Thu, Dec 6, 2018 at 12:09 PM Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
>
> Hi,
>
> On Wed, 5 Dec 2018, Jeff King wrote:
>
> > The model that fits more naturally with how Git is implemented would be
> > to use submodules. There you leak the hash of the commit from the
> > private submodule, but that's probably obscure enough (and if you're
> > really worried, you can add a random nonce to the commit messages in the
> > submodule to make their hashes unguessable).
>
> I hear myself frequently saying: "Friends don't let friends use
> submodules". It's almost like: "Some people think their problem is solved
> by using submodules. Only now they have two problems."

Blaming tools for their lack of evolution/development is not necessarily
the right approach. I recall having patches rejected on this very
mailing list that fixed obvious but minor things like whitespace
and coding style, because they *might* produce merge conflicts.
Would that situation entitle me to blame the shortcomings of the merge
algorithm, or could you imagine a better way out? (No need to answer,
it's purely to demonstrate that blaming tooling is not always the right
approach; only sometimes it may be.)

> There are big reasons, after all, why some companies go for monorepos: it
> is not for lack of trying to go with submodules, it is the problems that
> were incurred by trying to treat entire repositories the same as single
> files (or even trees): they are just too different.

We could change that in more places.

One example you might think of is the output of git-status, which
displays changed files. In the case of submodules it just
shows "submodule changes", which we already differentiate
into "dirty tree" and "different sha1 at HEAD".
Instead we could list all changed files recursively
in the superproject's git-status output.

Another example is the diff machinery, which already knows some
basics such as embedding submodule logs or actual diffs.
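To make the recursive-status idea a bit more concrete, here is a minimal
Python sketch of what the flattened output could look like. It is only an
illustration: `Repo` and `recursive_status` are hypothetical names, the
"modified:" prefix is made up for the example, and nothing here reflects
how git-status is actually implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical in-memory stand-in for a working tree (not a Git API)."""
    changed_files: list[str] = field(default_factory=list)
    # Maps a submodule's path inside this repo to the nested repository.
    submodules: dict[str, Repo] = field(default_factory=dict)


def recursive_status(repo: Repo, prefix: str = "") -> list[str]:
    """List changed files of a repo and, recursively, of its submodules.

    Paths are reported relative to the top-level superproject, instead of
    collapsing each submodule into a single "submodule changes" entry.
    """
    lines = [f"modified: {prefix}{name}" for name in sorted(repo.changed_files)]
    for path, sub in sorted(repo.submodules.items()):
        lines.extend(recursive_status(sub, f"{prefix}{path}/"))
    return lines
```

A nested submodule's files would then show up with their full path from the
superproject, e.g. a change to `aes.c` in a submodule at `libs/crypto`
would be listed as `libs/crypto/aes.c`.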
> In a previous life, I also tried to go for submodules, was burned, and had
> to restart the whole thing. We ended up with something that might work in
> this instance, too, although our use case was not need-to-know type of
> encapsulation. What we went for was straight up modularization.

So this is a "fix the data instead of the tool" approach, which seems to
be a local optimization (i.e. you only have to do it once, so it is
cheaper than fixing the tool for that workgroup)
... and because everyone does that, the tool never gets fixed.

> What I mean is that we split the project up into over 100 individual
> projects that are now all maintained in individual repositories, and they
> are connected completely outside of Git, via a dependency management
> system (in this case, Maven, although that is probably too Java-centric
> for AMD's needs).

Once you have the dependency management system in place, you will
encounter the rare case of still wanting to change things across
repository boundaries at the same time. Submodules offer that, which
is why Android wants to migrate off of the repo tool, and there
it seems natural to go for submodules.

> I just wanted to throw that out here: if you can split up your project
> into individual projects, it might make sense not to maintain them as
> submodules but instead as individual repositories whose artifacts are
> uploaded into a central, versioned artifact store (Maven, NuGet, etc). And
> those artifacts would then be retrieved by the projects that need them.

This is cool and industry standard. But once you happen to run into a bug
that involves 2 new artifacts (where each of the new artifacts works fine
on its own), you'd wish for something like "git-bisect but across
repositories". Submodules (in theory) allow for fine-grained bisection
across these repository boundaries, I would think.
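As a rough illustration of what "bisect across repositories" could mean,
here is a small Python sketch. It is not how git-bisect works today; it
assumes a linear superproject history in which each commit pins one
submodule commit, a linear submodule history, and a bug that stays present
once introduced. The names `first_bad`, `bisect_across`, `pins` and
`is_bad` are all made up for this example.

```python
from __future__ import annotations

from typing import Callable, Sequence


def first_bad(lo: int, hi: int, is_bad: Callable[[int], bool]) -> int:
    """Plain bisection: lo is known good, hi is known bad; return first bad."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


def bisect_across(
    pins: Sequence[int], is_bad: Callable[[int, int], bool]
) -> tuple[int, int]:
    """Find the first bad (superproject commit, submodule commit) pair.

    pins[i] is the index of the submodule commit recorded by superproject
    commit i. is_bad(sup, sub) tests one combination. Assumes the first
    superproject commit is good and the last one is bad.
    """
    # Step 1: bisect the superproject, using the submodule commit it pins.
    sup = first_bad(0, len(pins) - 1, lambda i: is_bad(i, pins[i]))
    sub_lo, sub_hi = pins[sup - 1], pins[sup]
    # If the pin did not move, or the old pin is already bad here, the
    # superproject commit itself introduced the bug.
    if sub_lo == sub_hi or is_bad(sup, sub_lo):
        return sup, sub_lo
    # Step 2: bisect the submodule range that this superproject commit
    # pulled in, holding the superproject fixed.
    return sup, first_bad(sub_lo, sub_hi, lambda j: is_bad(sup, j))
```

With a versioned artifact store, step 2 is the part that is hard to do:
only the published submodule versions (the values in `pins`) are available
as artifacts, not the commits in between.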
> I figure that that scheme might work for you better than submodules: I
> could imagine that you need to make the build artifacts available even to
> people who are not permitted to look at the corresponding source code,
> anyway.

This is a sensible suggestion, as they probably don't want to ramp up
development on submodules. :-)