Re: [PATCH 10/10] name-rev: release unused name strings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/4/2020 4:26 PM, René Scharfe wrote:
> The runtime actually increases slightly from:
> 
> Benchmark #1: ./git -C ../linux/ name-rev --all
>   Time (mean ± σ):     828.8 ms ±   5.0 ms    [User: 797.2 ms, System: 31.6 ms]
>   Range (min … max):   824.1 ms … 838.9 ms    10 runs
> 
> ... to:
> 
> Benchmark #1: ./git -C ../linux/ name-rev --all
>   Time (mean ± σ):     847.6 ms ±   3.4 ms    [User: 807.9 ms, System: 39.6 ms]
>   Range (min … max):   843.4 ms … 854.3 ms    10 runs
> 
> Why is that?  In the Chromium repo, ca. 44000 free(3) calls in
> create_or_update_name() release almost 1GB, while in the Linux repo
> 240000+ calls release a bit more than 5MB, so the average discarded
> name is ca.  1000x longer in the latter.
> 
> Overall I think it's the right tradeoff to make, as it helps curb the
> memory usage in repositories with big discarded names, and the added
> overhead is small.

I agree this trade-off is worth it. Your reasoning for why it is
happening makes sense, too.

> +	if (is_valid_rev_name(name)) {
> +		if (!is_better_name(name, taggerdate, distance, from_tag))
> +			return NULL;
> +
> +		/*
> +		 * This string might still be shared with ancestors
> +		 * (generation > 0).  We can release it here regardless,
> +		 * because the new name that has just won will be better
> +		 * for them as well, so name_rev() will replace these
> +		 * stale pointers when it processes the parents.
> +		 */
> +		if (!name->generation)
> +			free(name->tip_name);
> +	}

And here, this idea of "still be shared with ancestors" is confusing
without the additional context that the name-rev algorithm is using
depth-first-search to find the "best" name. At this point, we are
trying to replace the existing name with a better one, and use
"generation == 0" to declare "I am the initial owner of tip_name".
The rest of the ancestors will replace their tip_name pointer with
the new name, all while not accessing this freed memory.

Keeping such dangling references to freed memory is certainly
dangerous, but these references are short-lived within the name_rev()
method. That limits the possible ways this could cause issues in
the future.

Thanks,
-Stolee




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux