On Sat, Nov 23, 2024 at 12:01 AM Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> wrote:
>
> Contrary to expectations, passing a single candidate tag to "git
> describe" is slower than not passing any --match options.
>
>   $ time git describe --debug
>   ...
>   traversed 10619 commits
>   ...
>   v6.12-rc5-63-g0fc810ae3ae1
>
>   real 0m0.169s
>
>   $ time git describe --match=v6.12-rc5 --debug
>   ...
>   traversed 1310024 commits
>   v6.12-rc5-63-g0fc810ae3ae1
>
>   real 0m1.281s
>
> In fact, the --debug output shows that git traverses all or most of
> history. For some repositories and/or git versions, those 1.3s are
> actually 10-15 seconds.
>
> This has been acknowledged as a performance bug in git [1], and a fix
> is on its way [2]. However, no solution is yet in git.git, and even
> when one lands, it will take quite a while before it finds its way to
> a release and for $random_kernel_developer to pick that up.
>
> So rewrite the logic to use plumbing commands. For each of the
> candidate values of $tag, we ask: (1) is $tag even an annotated
> tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of
> HEAD? (3) If so, how many commits are in $tag..HEAD?
>
> I have tested that this produces the same output as the current script
> for ~700 random commits between v6.9..v6.10. For those 700 commits,
> and in my git repo, the 'make -s kernelrelease' command is on average
> ~4 times faster with this patch applied (geometric mean of ratios).
>
> For the commit mentioned in Josh's original report [3], the
> time-consuming part of setlocalversion goes from
>
>   $ time git describe --match=v6.12-rc5 c1e939a21eb1
>   v6.12-rc5-44-gc1e939a21eb1
>
>   real 0m1.210s
>
> to
>
>   $ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1
>   0 44
>
>   real 0m0.037s
>
> [1] https://lore.kernel.org/git/20241101113910.GA2301440@xxxxxxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/git/20241106192236.GC880133@xxxxxxxxxxxxxxxxxxxxxxx/
> [3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@xxxxxxxxxx/
>
> Reported-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@xxxxxxxxxx/
> Reported-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@xxxxxxxxxx/
> Signed-off-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
> ---
> v4:
>
> - Switch the logic to make use of the return values from try_tag,
>   instead of asking whether $count has been set.

No, please do not do this.

As I replied in v3, my plan is to set -e, because otherwise
the shell script is fragile.

With this version, -e will not work in try_tag()
because it is used in the if condition.

> +try_tag() {
> +	tag="$1"
> +
> +	# Is $tag an annotated tag?
> +	[ "$(git cat-file -t "$tag" 2> /dev/null)" = tag ] || return 1
> +
> +	# Is it an ancestor of HEAD, and if so, how many commits are in $tag..HEAD?
> +	# shellcheck disable=SC2046 # word splitting is the point here
> +	set -- $(git rev-list --count --left-right "$tag"...HEAD 2> /dev/null)
> +
> +	# $1 is 0 if and only if $tag is an ancestor of HEAD. Use
> +	# string comparison, because $1 is empty if the 'git rev-list'
> +	# command somehow failed.
> +	[ "$1" = 0 ] || return 1
> +
> +	# $2 is the number of commits in the range $tag..HEAD, possibly 0.
> +	count="$2"

Redundant double-quotes.

--
Best Regards
Masahiro Yamada
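
The errexit concern raised above can be reproduced in isolation. What
follows is only a minimal sketch, assuming a POSIX sh: probe,
probe_src, plain_call and if_call are hypothetical names made up for
this illustration and are not part of setlocalversion or of the patch.
It shows that a failing command inside a function aborts the shell
when the function is called as a plain command, but is ignored when
the same function is called as an if condition.

```shell
#!/bin/sh
# Hypothetical demo, not the try_tag from the patch.
# Script prologue shared by both child shells: enable errexit and
# define a function with a failing command in the middle.
probe_src='
set -e
probe() {
	false                       # fails mid-function
	echo "reached after false"  # printed only if errexit was ignored
}
'

# Plain command: errexit is active, so the child shell exits at false.
plain_call() {
	sh -c "$probe_src"'
probe
echo "after probe"'
}

# if condition: errexit is ignored for the whole body of probe.
if_call() {
	sh -c "$probe_src"'
if probe; then echo "probe returned 0"; fi'
}

plain_call || echo "plain call: child shell exited inside probe"
if_call
```

Each case runs in its own child shell so that the outer script's own
control flow does not influence the errexit state being observed.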