Re: git fetch --dry-run changes the local repo and transfers data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 27, 2022 at 10:42:25AM -0800, Tim Hockin wrote:

> Thanks.  What threw me is that I expected it to be "very fast" and
> "very cheap" . If I commit a multi-gig file on the server, I see the
> dry-run fetch takes several seconds (clearly indicating some work
> proportional to the server repo size).  I don't want to transfer that
> file on a dry-run, I hoped the server and client were both
> dry-running, andb the server could simply say "here's metadata for
> what I _would have_ returned if this was real".  Not possible?

No, the server has no notion of a dry run.

I think the best you could do with fetch is to ask for a smaller set of
objects. For example:

  git fetch --depth=1 --filter=tree:0 \
    https://github.com/git/git \
    2e71cbbddd64695d43383c25c7a054ac4ff86882

will grab a single object. You can even "git show -s 2e71cbbd" on the
result to see it (the "-s" is important to avoid it fetching the trees
to do a diff!). Two things to be aware of:

  - this may have some lingering effects in your repository, as the
    shallow and partial features store some metadata locally to make
    sense of the situation. You're probably best off doing it in a
    temporary repository.

  - not all servers will support --filter; it has to be enabled in the
    config.

There is potentially a more direct option, though. A while back, commit
a2ba162cda (object-info: support for retrieving object info, 2021-04-20)
added an extension that lets you get the size of an object on the
server. Unfortunately I don't think anybody ever wrote client-side
support. So you'd have to rig up something yourself like:

  # write git's packet format: 4-hex length followed by data
  pkt() {
    printf '%04x%s' "$((4+${#1}))" "$1"
  }

  # a sample input; you should be able to query multiple objects if you
  # want by adding more "oid" lines
  {
    pkt "command=object-info"
    printf "0001"
    pkt "size"
    pkt "oid 2e71cbbddd64695d43383c25c7a054ac4ff86882"
    printf "0000"
  } >input

  # this makes a local request; it's important we're in v2 mode, since
  # the extension only applies there. For http, I think you'd want
  # something like:
  #
  #  curl -H 'Git-Protocol: version=2' https://example.com/repo.git/git-upload-pack
  #
  # but I didn't test it.
  GIT_PROTOCOL=version=2 git-upload-pack /path/to/repo.git <input >output

I've left parsing the output as an exercise for the reader. But you
should be able to notice whether the object is present or not based on
the result.

Not all servers may support the extension. For example, I think GitHub's
servers have disabled it.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux