Re: [RFC] On the --depth argument when fetching with submodules

Stefan Beller <sbeller@xxxxxxxxxx> writes:

> Currently when cloning a project, including submodules, the --depth argument
> is passed on recursively, i.e. when cloning with "--depth 2", both the
> superproject as well as the submodule will have a depth of 2.  It is not
> guaranteed that the commits as specified by the superproject are included
> in these 2 commits of the submodule.
>
> Illustration:
> (superproject with depth 2, so A would have more parents, not shown)
>
> superproject/master: A <- B
>                     /      \
> submodule/master:  C <- D <- E <- F <- G
>
> (Current behavior is to fetch G and F)

I think the issue is deeper than merely "--depth 2", and you would
be better off stepping back and thinking about various use cases to
make sure that we know what kind of behaviour we want to support
before delving into one particular corner case.  We currently pass
the depth recursively, and I do not think it makes much sense, but I
view it as a secondary question "among the behaviours we want to
support, which one should be the default?"  It may turn out that not
passing it recursively at all, or even passing a different depth, is
a better default, but we wouldn't know until we know which
behaviours are desirable in various workflows.

If you are actively working on the superproject plus some submodules
but you are merely using the submodule you depicted above, not
working on changing it, even when you want the full history of the
superproject (i.e. no "--depth 2"), you may not want history of the
submodule.  Even though we have a way to say "I am not interested in
this submodule AT ALL" by not doing "submodule init", not having
anything at all at the path submodule/ may not allow you to build
the whole thing, and we currently lack a way to express "I am not
interested in the history of this thing, but I need at least the
tree that matches the commit referred to by the superproject".

If you are working on a single submodule, trying to fix a bug in the
context of the whole project, you might want to have a single-depth
clone of the superproject and all other submodules, plus the whole
history of the single submodule.

In either of these examples, the top-level "--depth" does not have
much to do with what depth the user wants to use when cloning or
fetching the submodule repositories.
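
To make the two examples above concrete, here is a minimal sketch of
a per-repository depth choice.  Everything in it is hypothetical: the
function resolve_depths(), the submodule names "lib-a" and "lib-b",
and the convention that None means full history are made up for
illustration and do not correspond to any existing option.

```python
def resolve_depths(top_depth, submodules, overrides=None, default_sub_depth=1):
    """Map each repository to a fetch depth; None stands for full history."""
    overrides = overrides or {}
    depths = {"superproject": top_depth}
    for name in submodules:
        # An explicit per-submodule choice wins; otherwise fall back to a
        # default that is deliberately independent of top_depth.
        depths[name] = overrides[name] if name in overrides else default_sub_depth
    return depths

# First example: full superproject history, submodules only at the
# commit the superproject refers to.
using = resolve_depths(None, ["lib-a", "lib-b"])

# Second example: shallow everywhere except the one submodule being fixed.
fixing = resolve_depths(1, ["lib-a", "lib-b"], overrides={"lib-a": None})

print(using)
print(fixing)
```

The point of the sketch is only that the submodule depths are never
derived from the top-level depth.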

I have a feeling (but I would not be surprised if somebody who uses
submodules heavily has a counter-example from real life) that
regardless of "--depth" or full clone, fetching the tip of the matching
branch is not a good default behaviour.  In your picture, even when
depth is not given at all, there isn't much point fetching F or G.
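
A toy model of the picture shows the mismatch.  This is only a
sketch: the PARENT map and shallow_from_tip() are stand-ins I made up
for the submodule history and for a depth-limited fetch from the
branch tip, not anything git itself exposes.

```python
# Submodule history from the picture, child -> parent.
PARENT = {"G": "F", "F": "E", "E": "D", "D": "C", "C": None}

def shallow_from_tip(tip, depth, parent=PARENT):
    """Return the commits a depth-limited fetch starting at `tip` would hold."""
    commits = []
    cur = tip
    while cur is not None and len(commits) < depth:
        commits.append(cur)
        cur = parent[cur]
    return commits

fetched = shallow_from_tip("G", 2)   # what "--depth 2" on master gives
needed = {"C", "E"}                  # commits recorded by A and B
missing = sorted(needed - set(fetched))
print(fetched, missing)
```

Neither commit the superproject refers to ends up in the result,
while starting from E instead of G would at least give the tree
needed for B.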

> So to fetch the correct submodule commits, we need to
> * traverse the superproject and list all submodule commits.
> * fetch these submodule commits (C and E) by sha1

I do not think requiring C to be fetched when the superproject
is cloned with --depth=2 (hence A and B are present in the result)
is a good definition of "correct submodule commits".  The initial
clone could be "superproject follows --depth, all submodules are
cloned with --depth=1 at the commits referenced by the superproject
tree"--by that definition, you need E but you do not want C.
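
The two definitions can be put side by side in a small sketch.  The
SUPERPROJECT list and both function names are assumptions made for
illustration; the only facts taken from the picture are that A
records C and B records E.

```python
# Superproject commits present after "--depth=2", oldest first, each
# with the submodule commit recorded in its tree.
SUPERPROJECT = [("A", {"submodule": "C"}), ("B", {"submodule": "E"})]

def commits_for_all_history(history):
    """Every submodule commit recorded by any superproject commit we have."""
    needed = {}
    for _commit, gitlinks in history:
        for path, sha in gitlinks.items():
            wanted = needed.setdefault(path, [])
            if sha not in wanted:
                wanted.append(sha)
    return needed

def commits_for_tip_only(history):
    """Only the submodule commits recorded by the checked-out tip."""
    _commit, gitlinks = history[-1]
    return {path: [sha] for path, sha in gitlinks.items()}

print(commits_for_all_history(SUPERPROJECT))
print(commits_for_tip_only(SUPERPROJECT))
```

The first function is the "traverse and list" reading from the quoted
proposal; the second is the "--depth=1 at the referenced commit"
reading, which asks for E alone.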

As a specification of the behaviour, the above two might work, but I
do not think that should be the implementation.  In other words,
"The implementation should behave as if it did the above two" is OK,
and it is also OK to qualify with further conditions to help the
implementation.  For example, the current structure assumes that E
and C are reachable from "some" ref in submodule, so that at least a
whole clone of the submodule would give them to you--otherwise you
would not be able to even build the superproject at A or B.  Perhaps
it is OK to further require that, when you are working in a single
branch mode and working on 'master', you are required to have
commits C and E reachable on the 'master' branch in the submodule,
and that may let you limit the need for such scanning of the
history?
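
As a sketch of what that extra requirement would check, assuming a
made-up in-memory history (PARENTS, REFS) rather than a real
repository:

```python
# Submodule history from the picture, commit -> list of parents, so
# that merges could be modelled as well.
PARENTS = {"G": ["F"], "F": ["E"], "E": ["D"], "D": ["C"], "C": []}
REFS = {"master": "G"}

def reachable_from(ref, parents=PARENTS, refs=REFS):
    """Return every commit reachable from the tip of `ref`."""
    seen = set()
    stack = [refs[ref]]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen

def all_on_branch(commits, ref):
    """True if every commit the superproject records is on `ref`."""
    return set(commits) <= reachable_from(ref)

print(all_on_branch(["C", "E"], "master"))
```

If the requirement holds, a single-branch fetch of 'master' in the
submodule is enough to obtain C and E without looking at other refs.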