[RFC PATCH 00/15] Sparse clones

This patch series implements some basics for sparse clones, which I
define as a clone where not all blob, tree, or commit objects are
downloaded.  The idea is to include sparseness both relative to span
of files/directories and depth of history, though currently I've only
put effort into span of paths.
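
To make "span of paths" concrete, here is a minimal sketch of a membership
test for sparse limits. It is an illustration only, not code from the
series: the function name within_sparse_limits and the rule that a limit
matches itself and everything beneath it are assumptions.

```python
def within_sparse_limits(path, limits):
    """Return True if `path` equals a limiting path or lies beneath one.

    Hypothetical helper: compares whole path components, so the limit
    "Documentation" does not match "Documentationx/a".
    """
    parts = path.strip("/").split("/")
    for limit in limits:
        limit_parts = limit.strip("/").split("/")
        if parts[:len(limit_parts)] == limit_parts:
            return True
    return False


# Example limits: one subtree and one single file ("subfile").
limits = ["Documentation", "builtin/clone.c"]
for path in ["Documentation/git.txt", "builtin/clone.c", "builtin/commit.c"]:
    print(path, within_sparse_limits(path, limits))
```

Comparing components rather than raw string prefixes is what keeps a
sibling directory with a similar name outside the limits.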

This series is built on pu, because it requires
en/object-list-with-pathspec.

What works:
  * all operations on non-sparse clones (full testsuite passes)
  * clone
  * read-tree
  * ls-files
  * cat-file
  * ls-tree
  * checkout
  * diff
  * status
  * log
  * add (except for not giving errors for paths outside the sparse limits)
  * commit
What doesn't work, yet:
  * Probably everything not tested in the new t572*.sh tests  :-)
  Notable examples of things missing from t572*.sh tests:
  * fetch
  * push
  * merge
  * rebase
  * thin packs (need to modify pack-objects to only delta against
    objects within the sparse limits)
  * densify command (to make a sparse repository non-sparse)
  * "missing" commits (see README file in PATCH1)

Cursory comparison with Nguyễn Thái Ngọc Duy's subtree clone (he's probably
made progress since his last submission, so this may be outdated):
  * His series supports fetch, mine doesn't (yet).
  * His series supports push,  mine doesn't (yet).
  * His series supports merge, mine doesn't (yet).
  * His handling of subtree requests over clone/fetch via capabilities
    is probably the right way; I'm pretty sure my passing of sparse
    limits as extra arguments to upload-pack would break backward
    compatibility and be bad.
  * He supports just one selected subtree (though he mentioned he's
    working on extending that); I support an arbitrary number of
    subtrees or subfiles.
  * He modifies index format (bumping to header version 4); I don't.
    Perhaps it's necessary for merge handling as I haven't implemented
    that, but at an early glance I don't think it's necessary.
  * While there are some similarities in the low-level details of how
    we've modified git to avoid missing objects, there are many
    differences as well.  I'm hoping to provoke some good discussion.

Elijah Newren (15):

  P1- README-sparse-clone: Add a basic writeup of my ideas for sparse clones

Just a big old write-up.  Not everything in it is implemented yet, but it
gives you the high-level picture.

  P2- Add tests for client handling in a sparse repository

Tests!  Yaay!

  P3- Read sparse limiting args from $GIT_DIR/sparse-limit

When a sparse clone is created, limiting paths will be stored.

  P4- When unpacking in a sparse repository, avoid traversing missing
    trees/blobs
  P5- read_tree_recursive: Avoid missing blobs and trees in a sparse
    repository
  P6- Automatically reuse sparse limiting arguments in revision walking
  P7- cache_tree_update(): Capability to handle tree entries missing from
    index
  P8- cache_tree_update(): Require relevant tree to be passed

Avoiding missing trees/blobs.  
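
The general idea of skipping missing objects during a tree walk can be
sketched as below. This is a toy model, not the tree-walk.c changes: trees
are nested dicts, objects that were never downloaded are represented as
None, and the helper names (overlaps, within, walk) are hypothetical.

```python
def _parts(path):
    return path.strip("/").split("/")


def within(path, limits):
    """True if `path` equals a limiting path or lies beneath one."""
    p = _parts(path)
    return any(p[:len(_parts(l))] == _parts(l) for l in limits)


def overlaps(path, limits):
    """True if `path` is within a limit or is an ancestor of one."""
    p = _parts(path)
    for limit in limits:
        lp = _parts(limit)
        n = min(len(p), len(lp))
        if p[:n] == lp[:n]:
            return True
    return False


def walk(tree, limits, prefix=""):
    """Yield blob paths inside the limits without touching other entries."""
    for name in sorted(tree):
        path = prefix + name
        if not overlaps(path, limits):
            continue  # outside the limits: the object may be missing
        entry = tree[name]
        if isinstance(entry, dict):
            yield from walk(entry, limits, path + "/")
        elif within(path, limits):
            yield path


# None marks objects that a sparse clone would not have downloaded.
tree = {
    "Documentation": {"git.txt": "blob"},
    "Makefile": None,
    "builtin": {"clone.c": "blob", "commit.c": None},
    "t": None,
}
print(list(walk(tree, ["Documentation", "builtin/clone.c"])))
# prints ['Documentation/git.txt', 'builtin/clone.c']
```

Note that "builtin" itself must still be traversed even though it is not a
limiting path, because one of the limits lies beneath it.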

  P9- Add tests for communication dealing with sparse repositories

Tests for clone/fetch/push/etc.  Just clone so far.

  P10- sparse-repo: Provide a function to record sparse limiting arguments

Can't just read from $GIT_DIR/sparse-limit; gotta write to it too.
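
A rough sketch of the read/write round trip for the limits file follows.
The file name sparse-limit comes from the patch titles; everything else is
an assumption for illustration, in particular the one-path-per-line format
and the function names, which are not taken from sparse-repo.c.

```python
import os
import tempfile


def write_sparse_limits(git_dir, limits):
    """Record the limiting paths, one per line (assumed format)."""
    with open(os.path.join(git_dir, "sparse-limit"), "w") as f:
        for limit in limits:
            f.write(limit + "\n")


def read_sparse_limits(git_dir):
    """Return the recorded limits, or [] for a non-sparse repository."""
    path = os.path.join(git_dir, "sparse-limit")
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return [line.rstrip("\n") for line in f if line.strip()]


# Round trip in a throwaway directory standing in for $GIT_DIR.
with tempfile.TemporaryDirectory() as git_dir:
    print(read_sparse_limits(git_dir))
    write_sparse_limits(git_dir, ["Documentation", "builtin/clone.c"])
    print(read_sparse_limits(git_dir))
```

Treating a missing file as "no limits" is one way to keep non-sparse
repositories on the existing code paths.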

  P11- builtin-clone: Accept paths for sparse clone
  P12- Pass extra (rev-list) args on, at least in some cases
  P13- upload-pack: Handle extra rev-list arguments being passed
  P14- EVIL COMMIT: Include all commits
  P15- clone: Ensure sparse limiting arguments are used in subsequent
    operations

I like the changes to how clone accepts additional rev-list arguments
to limit what is downloaded, but I'm not too happy with how these
patches pass those rev-list arguments on to upload-pack.  So don't
bother looking too closely at these.


 Makefile                                   |    2 +
 README-sparse-clone                        |  284 ++++++++++++++++++++++++++++
 builtin/archive.c                          |    2 +-
 builtin/checkout.c                         |    2 +-
 builtin/clone.c                            |   39 +++-
 builtin/commit.c                           |   15 +-
 builtin/fetch-pack.c                       |    3 +-
 builtin/merge.c                            |   19 +-
 builtin/revert.c                           |    7 +-
 builtin/send-pack.c                        |    3 +-
 builtin/write-tree.c                       |    6 +-
 cache-tree.c                               |   92 +++++++++-
 cache-tree.h                               |    4 +-
 cache.h                                    |    5 +-
 connect.c                                  |    9 +-
 diff.h                                     |    1 -
 environment.c                              |    2 +
 merge-recursive.c                          |    6 +-
 merge-recursive.h                          |    2 +-
 revision.c                                 |   21 ++-
 revision.h                                 |    3 +-
 setup.c                                    |    2 +
 sparse-repo.c                              |   84 ++++++++
 sparse-repo.h                              |    4 +
 t/sparse-lib.sh                            |   38 ++++
 t/t5601-clone.sh                           |   14 --
 t/t5720-sparse-repository-basics.sh        |  130 +++++++++++++
 t/t5721-sparse-repository-communication.sh |  106 +++++++++++
 test-dump-cache-tree.c                     |    3 +-
 transport-helper.c                         |    5 +-
 transport.c                                |   13 +-
 transport.h                                |    9 +-
 tree-diff.c                                |    4 +-
 tree-walk.c                                |   48 ++++-
 tree-walk.h                                |    3 +
 tree.c                                     |    5 +
 upload-pack.c                              |   45 +++--
 37 files changed, 952 insertions(+), 88 deletions(-)
 create mode 100644 README-sparse-clone
 create mode 100644 sparse-repo.c
 create mode 100644 sparse-repo.h
 create mode 100644 t/sparse-lib.sh
 create mode 100755 t/t5720-sparse-repository-basics.sh
 create mode 100755 t/t5721-sparse-repository-communication.sh

-- 
1.7.2.3.541.g94cc33
