On Mon, Jul 24, 2017 at 3:23 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> Also, while I do agree with you that the problem exists, it is
>> unclear why this patch is a solution and not a hack that sweeps a
>> problem under the rug.
>>
>> It is unclear why this "silently detach HEAD without telling the
>> user" is a better solution than erroring out, for example [*1*].
>
> Just to avoid possible confusion; I am not claiming that it would be
> more (or less for that matter) sensible to error out than silently
> detaching HEAD, because I am not giving the reason to substantiate
> the claim and I do not have a strong opinion to favour which one (or
> another potential solution, if any).
>
> I am just saying that the patch that proposes a solution should be
> backed with an explanation why it is a good idea, especially when
> there are obvious alternatives that are not so clearly inferior.
>
> Thanks.

So I took a step back and wrote up different proposals for where we
want to go long term. See below. This will help us figure out how to
approach this bug correctly.

------

RFC: A new type of symbolic refs

A symbolic ref can currently only point at a ref or another symbolic ref.
This proposal showcases different scenarios for how this could change in
the future.

A: HEAD pointing at the superproject's index
============================================

Introduce a new symbolic ref that points at the gitlink entry in the
superproject's index. The format is

    "repo:" <superprojects gitdir> '\0' <gitlink-path> '\0'

Ref read operations
-------------------
e.g. git log HEAD

Just like existing symrefs, the content of the ref will be read and followed.
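As a rough illustration of the "repo:" format given above, the following
Python sketch encodes and parses such a ref. The names SuperprojectRef,
encode_superproject_ref and parse_superproject_ref are hypothetical and
not part of git; the sketch also assumes that neither field may be empty,
which the proposal does not state.

```python
from typing import NamedTuple

PREFIX = b"repo:"


class SuperprojectRef(NamedTuple):
    """Hypothetical in-memory form of the proposed symref (not a git API)."""
    gitdir: bytes        # the superproject's gitdir
    gitlink_path: bytes  # path of the gitlink inside the superproject


def encode_superproject_ref(gitdir: bytes, gitlink_path: bytes) -> bytes:
    """Build: "repo:" <superprojects gitdir> NUL <gitlink-path> NUL."""
    for field in (gitdir, gitlink_path):
        # Assumption: fields are non-empty; a NUL inside a field would
        # make the NUL-terminated format ambiguous.
        if not field or b"\0" in field:
            raise ValueError("fields must be non-empty and NUL-free")
    return PREFIX + gitdir + b"\0" + gitlink_path + b"\0"


def parse_superproject_ref(content: bytes) -> SuperprojectRef:
    """Parse the format above; raise ValueError for anything else."""
    if not content.startswith(PREFIX):
        raise ValueError("not a repo: symref")
    body = content[len(PREFIX):]
    if not body.endswith(b"\0"):
        raise ValueError("missing trailing NUL")
    # Drop the final terminator, then split on the remaining separator.
    fields = body[:-1].split(b"\0")
    if len(fields) != 2 or not all(fields):
        raise ValueError("expected exactly two NUL-terminated fields")
    return SuperprojectRef(*fields)
```

A regular symref such as "ref: refs/heads/master" is rejected by the
parser, so a reader could try the existing symref syntax first and fall
back to this one.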
On reading "repo:", the sha1 will be obtained equivalent to:

    git -C <superproject> ls-files -s <gitlink-path> | awk '{ print $2}'

In case of error (superproject not found, gitlink path does not
exist), the ref is broken and

Ref write operations driven by the submodule, affecting symrefs
---------------------------------------------------------------
e.g. git checkout <other branch> (in the submodule)

In this scenario only the HEAD is optionally attached to the superproject,
so we can rewrite the HEAD to be anything else, such as a branch, just fine.
Once the HEAD is not pointing at the superproject any more, we'll leave the
submodule alone in operations driven by the superproject.

Ref write operations driven by the submodule, affecting target ref
------------------------------------------------------------------
e.g. git commit, reset --hard, update-ref (in the submodule)

The HEAD stays the same, pointing at the superproject.
The gitlink is changed to the target sha1, using

    git -C <superproject> update-index --add \
        --cacheinfo 160000,$SHA1,<gitlink-path>

This will affect the superproject's index, such that a commit in
the superproject is then needed.

Ref write operations driven by the superproject, changing the gitlink
---------------------------------------------------------------------
e.g. git checkout <tree-ish>, git reset --hard (in the superproject)

This will change the gitlink in the superproject's index, such that the HEAD
in the submodule changes, which would trigger an update of the
submodule's working tree.

Consistency considerations (gc)
-------------------------------
e.g. git gc --aggressive --prune=now

The repacking logic is already aware of a detached HEAD, such that
using this new symref mechanism would not generate problems as long as
we keep the HEAD attached to the superproject.
However, when commits/objects are created while the HEAD is attached to the
superproject and HEAD then switches to a local branch, there are problems
with the created objects, as they now seem unreachable.

This problem is not new, as a superproject may record submodule objects
that are not reachable from any of the submodule's branches. Such objects
fall prey to overzealous packing in the submodule.

This proposal however exposes this problem a lot more, as the submodule
has less need for branches.

B: HEAD pointing at a superproject's branch
===========================================

Instead of pointing at the index of the superproject, we also
encode a branch name:

    "repo:" <superprojects gitdir> '\0' <gitlink-path> '\0' <branch> '\0'

Ref read operations
-------------------
e.g. git log HEAD

This is similar to the case of pointing at the index, except that
the read operation reads from the tip of the branch:

    git -C <superproject> ls-tree <superproject branch> -- \
        <gitlink-path> | awk '{ print $3}'

Ref write operations driven by the submodule, affecting symrefs
---------------------------------------------------------------
e.g. git checkout <other branch> (in the submodule)

HEAD will be pointed at the local target branch, dropping the
affiliation to the superproject.

Ref write operations driven by the submodule, affecting target ref
------------------------------------------------------------------
e.g. git commit, reset --hard, update-ref (in the submodule)

As we're pointing at the superproject's branch, this would have to create a dummy(?)
commit in the superproject that just changes the submodule pointer in
the superproject's branch, such that the operation of storing a new
sha1 for the submodule is equivalent to

    git -C <superproject> update-index --add \
        --cacheinfo 160000,$SHA1,<gitlink-path>
    git -C <superproject> commit -m "Update submodule"

This behavior in the superproject is similar to Gerrit's subscription
model, where superprojects are updated from the submodule.
Each operation in the submodule triggers a local superproject commit.

Ref write operations driven by the superproject, changing the gitlink
---------------------------------------------------------------------
e.g. git merge, git pull (in the superproject)

This will change the gitlink in the superproject's index, such that the HEAD
in the submodule changes, which would trigger an update of the
submodule's working tree.

This would require a good merge strategy for submodules, i.e.
on merge the submodule would create a merge commit that is
recorded in the superproject's merge commit.

Consistency considerations (gc)
-------------------------------
e.g. git gc --aggressive --prune=now

Unlike in the previous proposal, the repacking problem comes with a
solution here. This is because any relevant commit in the submodule is
recorded in the superproject via a commit on a branch. Even
non-fast-forward histories in the submodule can then all be kept by
walking the superproject and looking at all gitlink entries of the
submodule.

C: All branches are symbolic references to the superproject
===========================================================

Instead of having just HEAD pointed at a superproject, all(!) branches
in the submodule point at the superproject's branch of the same name.
Symbolic refs that resolve to a local sha1 are not allowed; any
symbolic ref ends up pointing at the superproject eventually.

e.g. HEAD points at a submodule branch, which in turn points at the
superproject branch of the same name.

Ref read operations
-------------------
e.g.
git log

HEAD is read, which may be either (a) locally detached or (b) pointing at
a superproject branch. Resolve as in B.

Ref write operations driven by the submodule, affecting symrefs
---------------------------------------------------------------
e.g. git checkout <other branch> (in the submodule)

As there is no other local branch, HEAD would point at the other
submodule branch, which then points at another branch in the
superproject.

Ref write operations driven by the submodule, affecting target ref
------------------------------------------------------------------
e.g. git commit, reset --hard, update-ref (in the submodule)

Same as B.

Ref write operations driven by the superproject, changing the gitlink
---------------------------------------------------------------------
e.g. git merge, git pull (in the superproject)

Same as B.

Consistency considerations (gc)
-------------------------------
e.g. git gc --aggressive --prune=now

As the superproject contains all knowledge, the gc starts with a walk of all
superproject branches, distilling the recorded gitlink entries, and then
starts walking in the submodule from all the recorded gitlinks to create a
pack. gc and repacking would either be forbidden in the submodule or
deflected to the superproject.
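
The first half of that gc walk (collecting the gitlink entries from all
superproject branches) might look roughly like the Python sketch below.
It works on a toy in-memory model rather than on real repositories:
SuperCommit, collect_gitlink_roots and all commit names are made up for
illustration, and each superproject commit is assumed to record at most
one sha1 for the submodule in question.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class SuperCommit(NamedTuple):
    """Toy superproject commit (hypothetical, not a git data structure)."""
    parents: List[str]       # ids of parent commits in the superproject
    gitlink: Optional[str]   # submodule sha1 recorded at the gitlink path


def collect_gitlink_roots(branches: Dict[str, str],
                          commits: Dict[str, SuperCommit]) -> Set[str]:
    """Walk every superproject branch and gather all recorded gitlinks.

    The returned sha1s are the starting points from which the submodule
    history would then be walked to decide what to keep.
    """
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = list(branches.values())   # start from every branch tip
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue                  # shared history is visited once
        seen.add(commit_id)
        commit = commits[commit_id]
        if commit.gitlink is not None:
            roots.add(commit.gitlink)
        stack.extend(commit.parents)
    return roots


# Two branches that diverge after s1 and record different submodule
# commits; both sides are kept although neither descends from the other.
example_commits = {
    "s1": SuperCommit(parents=[], gitlink="aaa"),
    "s2": SuperCommit(parents=["s1"], gitlink="bbb"),
    "s3": SuperCommit(parents=["s1"], gitlink="ccc"),
}
example_branches = {"master": "s2", "topic": "s3"}
print(sorted(collect_gitlink_roots(example_branches, example_commits)))
```

Every sha1 collected this way is treated as reachable, which is why
non-fast-forward submodule histories survive in proposals B and C.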