Hi David
On 02/03/2025 20:34, David Mandelberg wrote:
Op 2025-03-02 om 09:17 schreef Phillip Wood:
On 02/03/2025 07:45, David Mandelberg via GitGitGadget wrote:
From: David Mandelberg <david@xxxxxxxxxxxxxx>
This sounds like a useful improvement and I like the idea, but I think
running "git for-each-ref" once for each remote is not going to scale
very well for people who have a lot of remotes. I think it would be
better to try and strip "refs/remote/$remote/" outside of "git for-
each- ref". I've not tested it but I think something like
Good point, I hadn't thought of that. Do you have a rough estimate of
what "a lot of remotes" is? 100ish, maybe?
I'm not really sure what a "large" number looks like for people who
track a lot of repositories. On windows forking is pretty slow (our
shell-based test suite is incredibly slow on that platform) so I think
it makes sense to try and avoid adding new processes.
I'd like to do some testing
to get actual performance numbers before trying to optimize this,
because I think the optimization has some drawbacks, see below.
If optimization is needed, another approach is to parallelize the forks:
{
local fer_pids=
for ...
do
__git for-each-ref ... &
fer_pids="$fer_pids $!"
done
test -z "$fer_pids" || wait $fer_pids
} | sort | uniq -u
That might cause spikes in cpu/memory/disk usage that aren't ideal though.
Yes, if there were a 100 remotes that's a bit of a fork-bomb.
local sed_cmd=
local remote
# ref names and therefore remote names cannot contain '*?[]^' so we
# only need to escape '.$/'. Using 'sort -r' means that if there is a
# remote called "github" and another called "github/alice" we will try
# and strip "github/alice" first.
for remote in $(__git_remotes | sort -r)
do
remote="${remote//./\\./}"
remote="${remote//\$/\\\$/}"
remote="${remote//\//\\\//}"
Just FYI since it took me hours to figure this out myself: I think this
would break tests on macos because of an old version of bash that
handles backslashes weirdly. I think removing the double quotes would
work around that issue, and be safe because word splitting doesn't
happen in assignments.
Thanks, I'm not familiar with bash's extensions to parameter
substitution. The completions can also but used under zsh
(git-completion.zsh is pretty much abandoned I think) but it looks like
bash and zsh agree on this expansion.
sed_cmd="${sed_cmd} -e s/^refs\/remotes\/$remote\/// -e t"
done
Mostly just a note to myself if I end up using this idea: I think a
space in $remote would break this because bash would split up the arg to
sed. There's probably some way to fix that with extra escaping though?
I should have said in the comment above that ref names cannot contain
space, tab, newline etc. so we don't have to worry about the shell's
word splitting splitting the patterns. They cannot contain backslashes
either.
__git for-each-ref --format="$fer_pfx$sfx" \
${GIT_COMPLETION_IGNORE_CASE+--ignore-case} \
"refs/remotes/*/$cur_*" "refs/remotes/*/$cur_*/**" |
This would search for $cur_ in the wrong place because * only matches
one path component, right? Changing to ** might help, but then it would
match in places more it shouldn't.
Oh, good point I'd missed that. Could we change sed to run with '-n' and
generate an expression that looks like
-e s|^refs/remotes/$remote/\($cur_\)|\1|p -e t
with the appropriate escaping of $remote and $cur_ so that we only print
the output of for-each-ref that matches $cur_?
Best Wishes
Phillip
sed $sed_cmd | sort -u
should work and means we're only forking three extra processes no
matter how many remotes the user has. I'm not sure if it changes the
output order when GIT_COMPLETION_IGNORE_CASE is set though.
Best Wishes
Phillip