Re: [PATCH v5 4/4] submodule: record superproject gitdir during 'update'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 08 2021, Emily Shaffer wrote:

> On Fri, Nov 05, 2021 at 09:51:12AM +0100, Ævar Arnfjörð Bjarmason wrote:
>> 
>> 
>> On Thu, Nov 04 2021, Emily Shaffer wrote:
>> 
>> > A recorded hint path to the superproject's gitdir might be added during
>> > 'git submodule add', but in some cases - like submodules which were
>> > created before 'git submodule add' learned to record that info - it might
>> > be useful to update the hint. Let's do it during 'git submodule
>> > update', when we already have a handle to the superproject while calling
>> > operations on the submodules.
>> >
>> > Signed-off-by: Emily Shaffer <emilyshaffer@xxxxxxxxxx>
>> > ---
>> >  git-submodule.sh            | 14 ++++++++++++++
>> >  t/t7406-submodule-update.sh | 12 ++++++++++++
>> >  2 files changed, 26 insertions(+)
>> >
>> > diff --git a/git-submodule.sh b/git-submodule.sh
>> > index 652861aa66..873d64eb99 100755
>> > --- a/git-submodule.sh
>> > +++ b/git-submodule.sh
>> > @@ -449,6 +449,20 @@ cmd_update()
>> >  			;;
>> >  		esac
>> >  
>> > +		# Cache a pointer to the superproject's common dir. This may have
>> > +		# changed, unless it's a fresh clone. Writes it to worktree
>> > +		# if applicable, otherwise to local.
>> > +		if test -z "$just_cloned"
>> > +		then
>> > +			sm_gitdir="$(git -C "$sm_path" rev-parse --absolute-git-dir)"
>> > +			relative_gitdir="$(git rev-parse --path-format=relative \
>> > +							 --prefix "${sm_gitdir}" \
>> > +							 --git-common-dir)"
>> > +
>> > +			git -C "$sm_path" config --worktree \
>> > +				submodule.superprojectgitdir "$relative_gitdir"
>> > +		fi
>> > +
>> 
>> Aside from the "is this really needed anymore?" comment on this caching
>> strategy in general I had in [1] does this really need to be adding new
>> shell code to git-submodule.sh given that we're actively trying to get
>> rid of it entirely and move it over to C.
>> 
>> I.e. here we've just called "git submodule--helper
>> run-update-procedure", and we pass it "$sm_path" (but not "$recursive",
>> but could easily do so).
>> 
>> It needs to clone this new submodule, so presumably it already has a
>> "$sm_gitdir" equivalent, and we can turn that into the same relative
>> path, no?
>> 
>> Can't we call this with a git_config_set*() somewhere in that code?
>> 
>> *investigates a bit*
>> 
>> Okey, so I see that [2] is part of a series that Atharva Raykar had a
>> version of (including this new git-submodule.sh code above) [3] rebased
>> on top of an older version of this topic. I.e. this bit is that part of that patch:
>> 
>> +       git_config_set_in_file(config_path, "submodule.superprojectGitdir",
>> +                              relative_path(get_git_dir(),
>> +                                            update_data->sm_path, &sb));
>> 
>> I also vaguely recall that Junio ejected his topic to make room for this
>> topic first.
>> 
>> Maybe I've missed some update on this but is his [2][3] broken in
>> combination with your topic here? And re[1] is whatever "caching" is
>> being done here still beneficial once this is all moved to C?
>> 
>> In your [4] there seems to be an agreement to do it the other way
>> around, but as noted in the mail linked from [1] maybe the caching isn't
>> needed anymore then?
>
> I answered vs. your other mail; yes, it's still needed, [...]

I replied just now (in [1], but it's not on lore yet, maybe vger ate my
mail again).

tl;dr: Ran some quick performance numbers, and can't reproduce any
slowdown at all when instrumenting the test suite to run another
setup_git_directory() that'll do a nested lookup on pretty much every
git command, and running the test suite (you mentioned a ~5x overall
slowdown).

Anyway, I think the two replies you've got here only partially address
what I pointed out, in particular in [2]:
    
    But I'm a bit iffy on a series that's structured in a way as to not
    start by asserting that we should have given semantics without the
    cache, and then adds the cache later as an optional optimization.

I.e. even if it was 10x as slow I think it should still be structured in
such a way as to at least have some GIT_TEST_* knob to turn it off in
favor of a slow path.

E.g. we've got commit-graph paths that are probably 100x or 1000x faster
than the slow path, but having the cache-less ones means we can validate
their correctness.
    
> and last I checked Atharva's series had been kicked out to make room
> for this one.

Indeed, but as a comment on this proposed series that doesn't really
address this in my above-quoted ([3])

    I.e. here we've just called "git submodule--helper
    run-update-procedure", and we pass it "$sm_path" (but not "$recursive",
    but could easily do so).

I.e. my understanding is that Junio ejected Atharva's because this
seemed like the smaller change, but perhaps it wasn't realized that we'd
be adding shellscript only to remove it again?

But in any case, even if we're patching git-submodule.sh not having
Atharva's go first in its entirety doesn't answer why we can't extract
the bits he/you came up with in C for this series.

I.e. that git-submodule.sh wouldn't need that shelling out (since it
just called a helper, that helper could just return this data too),
that's just a few lines above the hunk in this series.

I.e. if some remaining performance issue I couldn't reproduce in [1] is
due to the shellscript version of this v.s. calling a C function in
libgit isn't it better to focus on closing that gap than having the
caching mechanism?

(Per the above & linked mails I'm still not 100% sure this even *is* a
caching mechanism, or if we store primary data in it, but I'm just
trying to go by your commit message descriptions).

1. https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@xxxxxxxxxxxxxxxxxxx/
2. https://lore.kernel.org/git/87r1cnfkqx.fsf@xxxxxxxxxxxxxxxxxxx/
3. https://lore.kernel.org/git/211105.86wnlngebr.gmgdl@xxxxxxxxxxxxxxxxxxx/

>> 
>> 1. https://lore.kernel.org/git/211105.861r3vhtot.gmgdl@xxxxxxxxxxxxxxxxxxx/
>> 2. https://lore.kernel.org/git/20211013051805.45662-8-raykar.ath@xxxxxxxxx/
>> 3. https://lore.kernel.org/git/20211013051805.45662-1-raykar.ath@xxxxxxxxx/
>> 4. https://lore.kernel.org/git/YWiXL+plA7GHfuVv@xxxxxxxxxx/





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux