[PATCH/RFC 0/6] commit caching

This is the cleaned-up version of the commit caching patches I mentioned
here:

  http://article.gmane.org/gmane.comp.version-control.git/212329

The basic idea is to generate a cache file that sits alongside a
packfile and contains the timestamp, tree, and parents in a more compact
and easy-to-access format.

The timings from this one are roughly similar to what I posted earlier.
Unlike the earlier version, this one keeps the data for a single commit
together for better cache locality (though I don't think it made a big
difference in my tests, since my cold-cache timing test ends up touching
every commit anyway).  In short, for an extra 31M of disk space (~4%),
warm-cache "git rev-list --all" drops from ~4.2s to ~0.66s.

The big thing it does not (yet) do is use offsets to reference sha1s, as
Shawn suggested.  This would potentially drop the on-disk size from 84
bytes to 16 bytes per commit (or about 6M total for linux.git).
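
To make the 84-vs-16 arithmetic concrete, here is a rough sketch of two
record layouts that add up to those sizes. The struct and field names
are hypothetical and this is not the on-disk format from the patches;
it assumes at most two parents per record, a 4-byte timestamp, and (for
the compact form) that the commit itself is identified by its position
rather than stored.

```c
#include <stddef.h>
#include <stdint.h>

#define SHA1_LEN 20

/*
 * Hypothetical record using full sha1s:
 * 20 (commit) + 4 (timestamp) + 20 (tree) + 2 * 20 (parents) = 84 bytes.
 * Only byte arrays are used, so the compiler adds no padding.
 */
struct commit_record_sha1 {
	unsigned char commit[SHA1_LEN];
	unsigned char timestamp[4];	/* assumed 32-bit, network byte order */
	unsigned char tree[SHA1_LEN];
	unsigned char parent[2][SHA1_LEN];
};

/*
 * Hypothetical record where each sha1 is replaced by a 4-byte
 * position in the pack index: 4 + 4 + 2 * 4 = 16 bytes.
 */
struct commit_record_offset {
	uint32_t timestamp;
	uint32_t tree;
	uint32_t parent[2];
};

/* Total payload size for a given number of fixed-size records. */
size_t cache_bytes(size_t nr_commits, size_t record_size)
{
	return nr_commits * record_size;
}
```

At 84 bytes per record, 31M works out to somewhere around 380,000
records; the same number of 16-byte records comes to roughly 6M, which
is consistent with the estimate above.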

Coupled with using compression level 0 for trees (which do not compress
well at all, and yield only a 2% increase in size when left
uncompressed), my "git rev-list --objects --all" time drops from ~40s to
~25s. Perf reveals that we're spending most of the remaining time in
lookup_object. I've spent a fair bit of time trying to optimize that,
but with no luck; I think it's fairly close to optimal. The problem is
just that we call it a very large number of times, since it is the
mechanism by which we recognize that we have already processed each
sha1.
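
For readers not familiar with that code path, the "have we seen this
sha1" check is essentially a hash-table probe done once per object
reference. The following is a simplified sketch in that spirit, not
git's actual lookup_object; the names (seen_table, seen_lookup, and so
on) are made up for illustration, and it assumes open addressing with
linear probing and a fixed power-of-two table with no resizing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SHA1_LEN 20

struct seen_object {
	unsigned char sha1[SHA1_LEN];
};

struct seen_table {
	struct seen_object **slots;
	size_t size;	/* must be a power of two */
	size_t count;
};

/* sha1s are already well mixed, so the first 4 bytes serve as the hash. */
static size_t slot_for(const unsigned char *sha1, size_t size)
{
	uint32_t h;
	memcpy(&h, sha1, sizeof(h));
	return h & (size - 1);
}

int seen_table_init(struct seen_table *t, size_t size)
{
	assert(size && !(size & (size - 1)));
	t->slots = calloc(size, sizeof(*t->slots));
	t->size = size;
	t->count = 0;
	return t->slots ? 0 : -1;
}

/* Return the entry for sha1, or NULL if it has not been inserted. */
struct seen_object *seen_lookup(const struct seen_table *t,
				const unsigned char *sha1)
{
	size_t i = slot_for(sha1, t->size);

	while (t->slots[i]) {
		if (!memcmp(t->slots[i]->sha1, sha1, SHA1_LEN))
			return t->slots[i];
		i = (i + 1) & (t->size - 1);	/* linear probe */
	}
	return NULL;
}

void seen_insert(struct seen_table *t, struct seen_object *obj)
{
	size_t i;

	/* The sketch never grows; keep one slot empty so probes terminate. */
	assert(t->count + 1 < t->size);
	i = slot_for(obj->sha1, t->size);
	while (t->slots[i])
		i = (i + 1) & (t->size - 1);
	t->slots[i] = obj;
	t->count++;
}
```

Each probe is cheap, but a traversal with --objects performs one for
every tree entry it encounters, not just once per unique object, which
is why the call count dominates even when the function itself is tight.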

  [1/6]: csum-file: make sha1write const-correct
  [2/6]: strbuf: add string-chomping functions
  [3/6]: introduce pack metadata cache files
  [4/6]: introduce a commit metapack
  [5/6]: add git-metapack command
  [6/6]: commit: look up commit info in metapack

-Peff