Stefan Beller <sbeller@xxxxxxxxxx> writes:

> Jonathan brought up the following very long term vision:
> Eventually the everyday git commands do not treat submodules
> any special than trees, even the submodules git directory
> may be non existent (everything is absorbed into the superproject);
> so it really feels like a monorepo.

That may be one valid option to have but I do not see a reason why
it needs to be the only valid option.  So I do not see why you are
bringing it up in this thread.

But that is a good starting point to discuss one possible future.
Let's think aloud about what that world would look like.

 * When you "git clone" a superproject (perhaps implicitly with the
   "--recurse-submodules" option), the top-level project and all of
   its submodules will be checked out on the same branch (presumably
   the 'master' branch, which is the default).

 * Your attempt to "git commit", "git branch", "git checkout -b",
   etc. inside a submodule will either fail, or will implicitly
   chdir up to the top-level superproject and turn into the
   corresponding command with "--recurse-submodules".

 * "git commit --recurse-submodules -a" from the top-level will grab
   all the local changes in submodules, depth-first and recursively,
   make a commit in each submodule, bind the new commit to the index
   of the superproject that immediately contains the submodule and
   make a commit in it, until it percolates all the way up to the
   top-level superproject.  When working in this mode, branches in
   submodules do not really matter; the gitlink in the superproject
   is the only thing that matters.

 * It naturally follows that between two adjacent commits C and C~1
   in the superproject's history, the commit in a submodule bound to
   C and the commit in a submodule bound to C~1 are either the same
   (i.e. the superproject made a commit but there was no change in
   the submodule), or they are in a direct parent-child relationship
   (i.e. the local changes made to the submodule were recorded as a
   single commit when the superproject made the commit).
 * "git push --recurse-submodules" from the top-level will push the
   history of the superproject out, together with the history of the
   submodules.  I think it is doable, but a mechanism to enumerate
   all the commits bound from submodules to a range of the
   superproject's commits needs to be invented to drive pack-objects
   for efficient object transfer.  Having it would also help in fsck
   and gc---as branches are immaterial in the submodule
   repositories, commits in superprojects that are reachable from
   refs will have to serve as the connectivity anchors for the
   commit DAG in the submodule histories.

As long as we are talking about an idealized future world (well, at
least an idea of somebody's "ideal", not necessarily shared by
everybody), I wonder if there is even any need to have commits in
submodules in such a world.  To realize such a "monorepo" world, you
might be better off allowing a gitlink in the superproject to
directly point at a tree object in a submodule repository (making
them physically a single repository is an optional implementation
detail I choose to ignore in this discussion).
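
To make the enumeration idea in the "git push" item concrete, here
is a minimal sketch in Python over a toy in-memory history; it does
not touch git at all.  The names "history", "reachable" and
"bound_submodule_commits" are hypothetical, and the dict layout
(parents plus a path-to-commit "gitlinks" map) is an assumption made
purely for illustration, not how git stores anything.  Given a range
of superproject commits, it collects the submodule commits bound
inside the range ("want") and those bound by commits the other side
already has ("have"), which is roughly the positive/negative tip
list one would hand to pack-objects.

```python
def reachable(commits, tips):
    """Return the names of all commits reachable from `tips` via parents."""
    seen = set()
    stack = list(tips)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(commits[name]["parents"])
    return seen


def bound_submodule_commits(commits, include, exclude=()):
    """Enumerate gitlinks for the superproject range exclude..include.

    Returns (want, have): sets of (path, submodule_commit) pairs bound
    by commits inside the range and by the excluded commits respectively.
    """
    excluded = reachable(commits, exclude)
    in_range = reachable(commits, include) - excluded

    def links(names):
        return {
            (path, sub)
            for name in names
            for path, sub in commits[name]["gitlinks"].items()
        }

    return links(in_range), links(excluded)


# Toy superproject history (assumed data): C1 <- C2 <- C3,
# with a single submodule bound at the path "lib".
history = {
    "C1": {"parents": [], "gitlinks": {"lib": "s1"}},
    "C2": {"parents": ["C1"], "gitlinks": {"lib": "s1"}},  # submodule unchanged
    "C3": {"parents": ["C2"], "gitlinks": {"lib": "s2"}},  # submodule advanced
}

want, have = bound_submodule_commits(history, include=["C3"], exclude=["C1"])
to_send = want - have  # submodule tips the receiving side lacks
print(sorted(to_send))  # prints [('lib', 's2')]
```

A real implementation would have to find gitlinks by walking the
tree objects of each superproject commit instead of reading a
precomputed map, and would have to repeat the walk for nested
submodules; the sketch only shows the shape of the set computation.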