Re: [PATCH 4/4] gc --aggressive: three phase repacking

Jeff King <peff@xxxxxxxx> writes:

> On Tue, Mar 18, 2014 at 12:50:50AM -0400, Jeff King wrote:
>
>> On Sun, Mar 16, 2014 at 08:35:04PM +0700, Nguyễn Thái Ngọc Duy wrote:
>> 
>> > As explained in the previous commit, current aggressive settings
>> > --depth=250 --window=250 could slow down repository access
>> > significantly. Notice that people usually work on recent history only,
>> > we could keep recent history more loosely packed, so that repo access
>> > is fast most of the time while the pack file remains small.
>> 
>> One thing I have not seen is real-world timings showing the slowdown
>> based on --depth. Did I miss them, or are we just making assumptions
>> based on one old case from 2009 (that, AFAIK does not have real numbers,
>> just speculation)? Has anyone measured the effect of bumping the delta
>> cache size (and its hash implementation)?
>
> Just as a very quick, rough data point, here are before-and-after
> timings for the patch below doing "git rev-list --objects --all" on my
> linux.git, which is a mix of "--aggressive" and normal packing (I didn't
> do a "repack -f", but it's partially what I've downloaded from k.org and
> what I've repacked in various experiments over the past few months).
>
>   [before]
>   real    0m28.824s
>   user    0m28.620s
>   sys     0m0.232s
>
>   [after]
>   real    0m21.694s
>   user    0m21.544s
>   sys     0m0.172s
>
> The numbers below are completely pulled out of a hat, so we can perhaps
> do even better. But I think it shows that there is room for improvement
> in the delta base cache.
>
> ---
> diff --git a/environment.c b/environment.c
> index c3c8606..73ed670 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -37,7 +37,7 @@ int core_compression_seen;
>  int fsync_object_files;
>  size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
>  size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
> -size_t delta_base_cache_limit = 16 * 1024 * 1024;
> +size_t delta_base_cache_limit = 128 * 1024 * 1024;

You need to update the corresponding file in Documentation as well.  I
can offer a patch.

>  unsigned long big_file_threshold = 512 * 1024 * 1024;
>  const char *pager_program;
>  int pager_use_color = 1;
> diff --git a/sha1_file.c b/sha1_file.c
> index b37c6f6..a9ab8e3 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1944,7 +1944,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
>  	return buffer;
>  }
>  
> -#define MAX_DELTA_CACHE (256)
> +#define MAX_DELTA_CACHE (1024)

This one really needs experimentation.  I found that increases here lead
to performance degradation rather soon, probably because of decreased
memory locality without significant reduction in cache collisions.  Not
sure whether it's worth touching at all.
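
The collision half of that trade-off can be measured in isolation by
replaying a recorded lookup sequence against tables of different sizes.
What follows is only a minimal sketch: slot_for(), count_misses() and
MAX_SLOTS are hypothetical names, the plain modulo is a stand-in rather
than the hash sha1_file.c uses, and the model says nothing about memory
locality, which still has to be timed on a real repository.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 4096	/* upper bound for this sketch only */

/* Hypothetical slot hash (plain modulo); NOT the one in sha1_file.c. */
static size_t slot_for(unsigned long key, size_t slots)
{
	return key % slots;
}

/*
 * Replay n lookups against a direct-mapped cache with `slots` entries
 * and return the number of misses.  A miss is either a cold slot or a
 * slot currently holding a different key (a collision eviction).
 */
static size_t count_misses(const unsigned long *keys, size_t n, size_t slots)
{
	static unsigned long table[MAX_SLOTS];
	static unsigned char used[MAX_SLOTS];
	size_t i, misses = 0;

	assert(slots > 0 && slots <= MAX_SLOTS);
	memset(used, 0, sizeof(used));	/* start each replay with an empty cache */

	for (i = 0; i < n; i++) {
		size_t s = slot_for(keys[i], slots);

		if (!used[s] || table[s] != keys[i]) {
			misses++;
			table[s] = keys[i];	/* evict whatever was there */
			used[s] = 1;
		}
	}
	return misses;
}
```

With the alternating keys 0, 256, 0, 256, a 256-slot table misses on all
four lookups, while a 1024-slot table misses only on the first two.  If
a real access trace shows no such drop in misses when the table grows,
the extra slots cost locality without buying anything.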

-- 
David Kastrup