[PATCH v3 0/9] Rewrite packfile reuse code

This patch series rewrites the code that tries to reuse existing
packfiles.

The code in this patch series was written by GitHub, and Peff nicely
provided it in the following discussion:

https://public-inbox.org/git/3E56B0FD-EBE8-4057-A93A-16EBB09FBCE0@xxxxxxxxxxxxxx/

The previous versions of this patch series were discussed here:

V2: https://public-inbox.org/git/20191019103531.23274-1-chriscool@xxxxxxxxxxxxx/
V1: https://public-inbox.org/git/20190913130226.7449-1-chriscool@xxxxxxxxxxxxx/

Thanks to the reviewers!

According to Peff, this new code is a lot smarter than what it
replaces. It allows "holes" in the chunks of packfile being reused
and skips over them, rewriting OFS_DELTA offsets as it goes to
account for the holes. It is basically a linear walk over the
packfile, with the important distinction that the reused objects are
not added to the object_entry array. This makes them very lightweight
(especially in memory use, but they also aren't considered as bases
when finding new deltas, etc). It seems like a good compromise between
the cost to serve a clone and the quality of the resulting packfile.
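
To illustrate the offset rewriting described above, here is a minimal
sketch, not the code from the series. It assumes that each reused run
of bytes is recorded together with the number of bytes dropped before
it, and that the list is sorted by source offset. The names
chunk_sketch, shift_for() and rewritten_delta_distance() are
hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one run of bytes copied verbatim. */
struct chunk_sketch {
	uint64_t original; /* offset of the run in the source packfile */
	uint64_t shift;    /* bytes skipped (holes) before this run */
};

/*
 * Return the shift that applies to source offset `where`.
 * `chunks` must be sorted by `original`; offsets before the first
 * chunk are treated as unshifted.
 */
static uint64_t shift_for(const struct chunk_sketch *chunks, size_t nr,
			  uint64_t where)
{
	size_t lo = 0, hi = nr;

	/* Binary search for the number of chunks starting at or before `where`. */
	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;

		if (chunks[mi].original <= where)
			lo = mi + 1;
		else
			hi = mi;
	}
	return lo ? chunks[lo - 1].shift : 0;
}

/*
 * An OFS_DELTA stores the distance from the delta back to its base.
 * When holes between the two are skipped, that distance shrinks by
 * the total size of those holes.
 */
static uint64_t rewritten_delta_distance(const struct chunk_sketch *chunks,
					 size_t nr, uint64_t delta_off,
					 uint64_t base_off)
{
	uint64_t new_delta = delta_off - shift_for(chunks, nr, delta_off);
	uint64_t new_base = base_off - shift_for(chunks, nr, base_off);

	return new_delta - new_base;
}
```

With runs starting at source offsets 12, 500 and 900 and shifts of 0,
100 and 250 bytes, a delta at offset 950 whose base sits at offset 300
goes from a distance of 650 to a distance of 400, because 250 bytes of
holes between them were skipped. When the delta and its base fall in
the same run, the distance is unchanged.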

Changes since the previous patch series are the following:

  - Rebased onto current master (d9f6f3b619, The first batch post 2.24
    cycle, 2019-11-10)

  - Removed a paragraph in the commit message of patch 3/9 as suggested
    by Jonathan Tan.

  - Improved the commit message of patch 9/9 as suggested by Jonathan Tan.

  - Renamed fields of struct reused_chunk in patch 9/9 as suggested by
    Jonathan Tan.

  - Added a few comments in patch 9/9 as suggested by Jonathan Tan.

It would be helpful if Peff could answer some of the comments made
by Jonathan Tan about patch 9/9.

I have put Peff as the author of all the commits.

Jeff King (9):
  builtin/pack-objects: report reused packfile objects
  packfile: expose get_delta_base()
  ewah/bitmap: introduce bitmap_word_alloc()
  pack-bitmap: don't rely on bitmap_git->reuse_objects
  pack-bitmap: introduce bitmap_walk_contains()
  csum-file: introduce hashfile_total()
  pack-objects: introduce pack.allowPackReuse
  builtin/pack-objects: introduce obj_is_packed()
  pack-objects: improve partial packfile reuse

 Documentation/config/pack.txt |   4 +
 builtin/pack-objects.c        | 243 +++++++++++++++++++++++++++-------
 csum-file.h                   |   9 ++
 ewah/bitmap.c                 |  13 +-
 ewah/ewok.h                   |   1 +
 pack-bitmap.c                 | 178 ++++++++++++++++++-------
 pack-bitmap.h                 |   6 +-
 packfile.c                    |  10 +-
 packfile.h                    |   3 +
 9 files changed, 357 insertions(+), 110 deletions(-)

-- 
2.24.0-rc1



