Re: [PATCH v2 3/7] subtree: persist cache between split runs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tom,

On Tue, 6 Oct 2020, Tom Clarkson via GitGitGadget wrote:

> @@ -48,6 +49,7 @@ annotate=
>  squash=
>  message=
>  prefix=
> +clearcache=

It might be more consistent to call it `clear_cache` (i.e. with an
underscore), just like `ignore_joins`.

>
>  debug () {
>  	if test -n "$debug"
> @@ -131,6 +133,9 @@ do
>  	--no-rejoin)
>  		rejoin=
>  		;;
> +	--clear-cache)
> +		clearcache=1
> +		;;
>  	--ignore-joins)
>  		ignore_joins=1
>  		;;
> @@ -206,9 +211,13 @@ debug "opts: {$*}"
>  debug
>
>  cache_setup () {
> -	cachedir="$GIT_DIR/subtree-cache/$$"
> -	rm -rf "$cachedir" ||
> -		die "Can't delete old cachedir: $cachedir"
> +	cachedir="$GIT_DIR/subtree-cache/$prefix"

Excellent, the `prefix` should be "unique enough".

> +	if test -n "$clearcache"
> +	then
> +		debug "Clearing cache"
> +		rm -rf "$cachedir" ||
> +			die "Can't delete old cachedir: $cachedir"
> +	fi
>  	mkdir -p "$cachedir" ||
>  		die "Can't create new cachedir: $cachedir"
>  	mkdir -p "$cachedir/notree" ||
> @@ -266,6 +275,16 @@ cache_set () {
>  	echo "$newrev" >"$cachedir/$oldrev"
>  }
>
> +cache_set_if_unset () {
> +	oldrev="$1"
> +	newrev="$2"

`local`? ;-)

> +	if test -e "$cachedir/$oldrev"
> +	then
> +		return
> +	fi
> +	echo "$newrev" >"$cachedir/$oldrev"

So that directory contains commit mappings, a file for each mapped
revision.

Thinking back to patch 2/11, I am now no longer that sure that it makes
sense to fill it up with every commit in that commit range: performance
suffers when directories contain too many files.

For example, I had a case in the past where it took a minute just to
enumerate a directory, and even looking whether a file existed in that
directory was not exactly fun.

In any case, I would write it slightly shorter:

	test -e "$cachedir/$oldrev" ||
	echo "$newrev" >"$cachedir/$oldrev"

> +}
> +
>  rev_exists () {
>  	if git rev-parse "$1" >/dev/null 2>&1
>  	then
> @@ -375,13 +394,13 @@ find_existing_splits () {
>  			then
>  				# squash commits refer to a subtree
>  				debug "  Squash: $sq from $sub"
> -				cache_set "$sq" "$sub"
> +				cache_set_if_unset "$sq" "$sub"
>  			fi
>  			if test -n "$main" -a -n "$sub"
>  			then
>  				debug "  Prior: $main -> $sub"
> -				cache_set $main $sub
> -				cache_set $sub $sub
> +				cache_set_if_unset $main $sub
> +				cache_set_if_unset $sub $sub
>  				try_remove_previous "$main"
>  				try_remove_previous "$sub"
>  			fi
> @@ -688,6 +707,8 @@ process_split_commit () {
>  		if test -n "$newparents"
>  		then
>  			cache_set "$rev" "$rev"
> +		else
> +			cache_set "$rev" ""

Was this hunk intended to be snuck in here? I can understand the
s/cache_set/cache_set_if_unset/ changes, of course, but not this hunk.

>  		fi
>  		return
>  	fi
> @@ -785,7 +806,7 @@ cmd_split () {
>  			# the 'onto' history is already just the subdir, so
>  			# any parent we find there can be used verbatim
>  			debug "  cache: $rev"
> -			cache_set "$rev" "$rev"
> +			cache_set_if_unset "$rev" "$rev"
>  		done
>  	fi
>
> @@ -798,7 +819,7 @@ cmd_split () {
>  		git rev-list --topo-order --skip=1 $mainline |
>  		while read rev
>  		do
> -			cache_set "$rev" ""
> +			cache_set_if_unset "$rev" ""

Okay. A quite interesting question now would be: are there any callers of
`cache_set` left? If so, why?

Thanks,
Dscho

>  		done || exit $?
>  	fi
>
> --
> gitgitgadget
>
>




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux