Re: [PATCH 0/5] Suggested for PU: revision caching system to significantly speed up packing/walking

Hi,

On Thu, 6 Aug 2009, Nick Edelen wrote:

> SUGGESTED FOR 'PU':
> 
> Traversing objects is currently very costly, as every commit and tree must be 
> loaded and parsed.  Much time and energy could be saved by caching metadata and 
> topological info in an efficient, easily accessible manner.  Furthermore, this 
> could improve git's interfacing potential, by providing a condensed summary of 
> a repository's commit tree.
> 
> This is a series to implement such a revision caching mechanism, aptly named 
> rev-cache.  The series will provide:
>  - a core API to manipulate and traverse caches
>  - an integration into the internal revision walker
>  - a porcelain front-end providing access to users and (shell) applications
>  - a series of tests to verify/demonstrate correctness
>  - documentation of the API, porcelain and core concepts
> 
> On cold starts rev-cache has sped up packing and walking by a factor of 4, and 
> by over twice that on warm starts.  Some timings on slax for the linux repository:
> 
> rev-list --all --objects >/dev/null
>  default
>    cold    1:13
>    warm    0:43
>  rev-cache'd
>    cold    0:19
>    warm    0:02
> 
> pack-objects --revs --all --stdout >/dev/null
>  default
>    cold    2:44
>    warm    1:21
>  rev-cache'd
>    cold    0:44
>    warm    0:10

Nice!
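
For reference, the speedups implied by the timings quoted above can be worked
out with a short sketch like the one below. The timings are copied from the
quoted table; the helper names (to_seconds, speedup) are made up for
illustration and are not part of the rev-cache series.

```python
# Timings copied from the quoted table, written as "minutes:seconds".
# Each entry is (default, rev-cache'd). Helper names are illustrative only.
TIMINGS = {
    "rev-list --all --objects": {
        "cold": ("1:13", "0:19"),
        "warm": ("0:43", "0:02"),
    },
    "pack-objects --revs --all --stdout": {
        "cold": ("2:44", "0:44"),
        "warm": ("1:21", "0:10"),
    },
}


def to_seconds(stamp):
    """Convert an "M:SS" string into a number of seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(default, cached):
    """Ratio of the default run time to the rev-cache'd run time."""
    return to_seconds(default) / to_seconds(cached)


for command, runs in TIMINGS.items():
    for start, (default, cached) in runs.items():
        print(f"{command:36} {start}: {speedup(default, cached):.1f}x")
```

With the quoted numbers this gives about 3.8x (cold) and 21.5x (warm) for
rev-list, and about 3.7x (cold) and 8.1x (warm) for pack-objects.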

> The mechanism is minimally intrusive: most of the changes take place in 
> separate files, and only a handful of git's existing functions are 
> modified.

Sorry, I forgot the details; could you quickly remind me why these caches 
are not in the pack index files?

>  Documentation/rev-cache.txt           |   51 +
>  Documentation/technical/rev-cache.txt |  336 ++++++
>  Makefile                              |    2 +
>  blob.c                                |    1 +
>  blob.h                                |    1 +
>  builtin-rev-cache.c                   |  284 +++++
>  builtin.h                             |    1 +
>  commit.c                              |    3 +
>  commit.h                              |    2 +
>  git.c                                 |    1 +
>  list-objects.c                        |   49 +-
>  rev-cache.c                           | 1832 +++++++++++++++++++++++++++++++++
>  revision.c                            |   89 ++-
>  revision.h                            |   46 +-
>  t/t6015-rev-cache-list.sh             |  228 ++++
>  t/t6015-sha1-dump-diff.py             |   36 +

Hmpf.

We got rid of the last Python script in Git a long time ago, but now two 
different patch series try to sneak that dependency (at least for testing) 
back in.

That's all the worse because we cannot use Python in msysGit, and Windows 
should be a platform benefitting dramatically from your work.

Ciao,
Dscho

