[RFC PATCH 0/4] cache parent project's gitdir in submodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



It's necessary for a superproject to know which submodules it contains.
However, historically submodules do not know they are anything but a
normal single-repo Git project (or a superproject, in nested-submodules
cases). This decision does help prevent us from having to support
divergent behaviors in submodule projects vs. superprojects, which makes
sure Git is (somewhat) less confusing for the reader, and helps simplify
our code.

One could imagine, though, some conveniences we could gain from
submodules learning added behavior (though not necessarily *different*
behavior) to provide more context about the state of the project as a
whole, and to make large submodule-based projects easier to work with.
One example is a series[1] I sent some time ago, adding a config to be
shared between the superproject and all its submodules. The RFC[2] I
sent around the same time mentions a few other examples, such as "git
status" run in the submodule noticing whether the superproject has a
commit referencing the submodule's current HEAD.

It's expensive and non-definitive to try and guess whether or not the
current repo is a submodule. submodule.c:get_superproject_working_tree()
does so by essentially running 'git -C .. ls-files -- <own-path>',
invoking an additional process. get_superproject_working_tree() is not
called often, so that's mostly fine. However, [1] attempted to include
an additional config located in the superproject's gitdir by running
'git -C .. rev-parse --git-dir' during startup - a little expensive in
the best case, because it's an extra process, but extremely expensive in
the case when the current repo is *not* a submodule, because we hunt all
the way up the filesystem looking for a '.git'. Adding that cost to
every startup is infeasible.

To that end, in this series I propose caching a path to the
superproject's gitdir - by having the superproject write that relative
path to the submodule's config on creation or update. The goal here is
*not* to say "If I am a submodule, I must have
submodule.superprojectGitDir set" - but instead to say "If I have
submodule.superprojectGitDir set, then I must be a submodule." That is,
I expect we will find edge cases where a submodule was introduced in
some interesting way that bypassed any of the patches below, and
therefore doesn't have the superproject's gitdir cached.

The combination of these two rules:
 - Anything relying on submodule.superprojectGitDir must be nice to
   have, but not essential, because
 - It's possible for a submodule to be valid without having
   submodule.superprojectGitDir set
makes me feel more comfortable with the idea of submodules learning
additional behavior based on this config. I feel pretty unconfident in
our ability to ensure that *every* submodule has this config set.

The series covers a few paths for introducing that config, which I'm
hoping covers most cases.
 - "git submodule update" (which seems to be part of the "git submodule
   init" flow)
 - "git submodule absorbgitdir" to convert a "git init"'d repo into a
   submodule

Notably, we can only really set this config when 'the_repository' is the
superproject - that appears to be the only time when we know the gitdirs
of both the superproject and the submodule.

I'm expecting folks may have a lot to say about this, so I look forward
to discussion :)

 - Emily

1: https://lore.kernel.org/git/20210423001539.4059524-1-emilyshaffer@xxxxxxxxxx
2: https://lore.kernel.org/git/YHofmWcIAidkvJiD@xxxxxxxxxx

Emily Shaffer (4):
  t7400-submodule-basic: modernize inspect() helper
  introduce submodule.superprojectGitDir cache
  submodule: cache superproject gitdir during absorbgitdirs
  submodule: cache superproject gitdir during 'update'

 builtin/submodule--helper.c        |  4 +++
 git-submodule.sh                   |  9 ++++++
 submodule.c                        | 10 ++++++
 t/t7400-submodule-basic.sh         | 49 ++++++++++++++----------------
 t/t7406-submodule-update.sh        | 10 ++++++
 t/t7412-submodule-absorbgitdirs.sh |  1 +
 6 files changed, 57 insertions(+), 26 deletions(-)

-- 
2.32.0.272.g935e593368-goog




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux