[PATCH v4 0/7] fetch: add repair: full refetch without negotiation (was: "refiltering")

If the filter on a partial clone repository is changed, for example from
blob:none to blob:limit=1m, there is currently no straightforward way to
bulk-refetch the objects that match the new filter for existing local
commits. This is because the client reports those commits as "have" during
fetch negotiation, so the objects they depend on are not included in the
transferred pack. Another use case is discussed at [1].

This patch series introduces a --refetch option to fetch & fetch-pack to
enable doing a full fetch without performing any commit negotiation with the
remote, as a fresh clone does. It builds upon cbe566a071 ("negotiator/noop:
add noop fetch negotiator", 2020-08-18).

 * Using --refetch will produce duplicated objects between the existing and
   newly fetched packs, but maintenance will clean them up when it runs
   automatically post-fetch (if enabled).
 * If a user fetches with --refetch while applying a more restrictive
   partial clone filter than before (e.g. blob:limit=1m, then
   blob:limit=1k), the end result is a no-op, since any referenced object
   already in the local repository is never removed. More advanced
   repacking which could improve this scenario is currently proposed at
   [2].

[1]
https://lore.kernel.org/git/aa7b89ee-08aa-7943-6a00-28dcf344426e@xxxxxxxxxxx/
[2]
https://lore.kernel.org/git/21ED346B-A906-4905-B061-EDE53691C586@xxxxxxxxx/

Changes since v3:

 * Mention fetch --refetch in the remote.<name>.partialclonefilter
   documentation.

Changes since v2:

 * Changed the name from "repair" to "refetch". While it's conceivable to
   use it in some object DB repair situations, that's not the focus of
   these changes.
 * Pass config options to maintenance via GIT_CONFIG_PARAMETERS
 * Split out auto-maintenance to a separate & more robust test
 * Minor fixes/improvements from reviews by Junio & Ævar

Changes since RFC (v1):

 * Changed the name from "refilter" to "repair"
 * Removed dependency between server-side support for filtering and repair
 * Added a test case for a shallow clone
 * Post-fetch auto maintenance now strongly encourages
   repacking/consolidation

Reviewed-by: Calvin Wan <calvinwan@xxxxxxxxxx>

Robert Coup (7):
  fetch-negotiator: add specific noop initializer
  fetch-pack: add refetch
  builtin/fetch-pack: add --refetch option
  fetch: add --refetch option
  t5615-partial-clone: add test for fetch --refetch
  fetch: after refetch, encourage auto gc repacking
  docs: mention --refetch fetch option

 Documentation/config/remote.txt           |  6 +-
 Documentation/fetch-options.txt           | 10 +++
 Documentation/git-fetch-pack.txt          |  4 ++
 Documentation/technical/partial-clone.txt |  3 +
 builtin/fetch-pack.c                      |  4 ++
 builtin/fetch.c                           | 34 +++++++++-
 fetch-negotiator.c                        |  5 ++
 fetch-negotiator.h                        |  8 +++
 fetch-pack.c                              | 46 ++++++++-----
 fetch-pack.h                              |  1 +
 remote-curl.c                             |  6 ++
 t/t5616-partial-clone.sh                  | 81 ++++++++++++++++++++++-
 transport-helper.c                        |  3 +
 transport.c                               |  4 ++
 transport.h                               |  4 ++
 15 files changed, 197 insertions(+), 22 deletions(-)


base-commit: abf474a5dd901f28013c52155411a48fd4c09922
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1138%2Frcoup%2Frc-partial-clone-refilter-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1138/rcoup/rc-partial-clone-refilter-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1138

Range-diff vs v3:

 1:  96a75be3d8a = 1:  6cd6d4a59f6 fetch-negotiator: add specific noop initializer
 2:  04ca6a07f85 = 2:  03f0de3d28c fetch-pack: add refetch
 3:  879d30c4473 = 3:  f7942344ff8 builtin/fetch-pack: add --refetch option
 4:  a503b98f333 = 4:  78501bbf281 fetch: add --refetch option
 5:  01f22e784a5 = 5:  6c17167ac1e t5615-partial-clone: add test for fetch --refetch
 6:  31046625987 = 6:  28c07219fd8 fetch: after refetch, encourage auto gc repacking
 7:  f923a06aab5 ! 7:  da1e6de7a9f doc/partial-clone: mention --refetch fetch option
     @@ Metadata
      Author: Robert Coup <robert@xxxxxxxxxxx>
      
       ## Commit message ##
     -    doc/partial-clone: mention --refetch fetch option
     +    docs: mention --refetch fetch option
      
     -    Document it for partial clones as a means to apply a new filter.
     +    Document it for partial clones as a means to apply a new filter, and
     +    reference it from the remote.<name>.partialclonefilter config parameter.
      
          Signed-off-by: Robert Coup <robert@xxxxxxxxxxx>
      
     + ## Documentation/config/remote.txt ##
     +@@ Documentation/config/remote.txt: remote.<name>.promisor::
     + 	objects.
     + 
     + remote.<name>.partialclonefilter::
     +-	The filter that will be applied when fetching from this
     +-	promisor remote.
     ++	The filter that will be applied when fetching from this	promisor remote.
     ++	Changing or clearing this value will only affect fetches for new commits.
     ++	To fetch associated objects for commits already present in the local object
     ++	database, use the `--refetch` option of linkgit:git-fetch[1].
     +
       ## Documentation/technical/partial-clone.txt ##
      @@ Documentation/technical/partial-clone.txt: Fetching Missing Objects
         currently fetches all objects referred to by the requested objects, even

-- 
gitgitgadget


