Re: [PATCH] clone: support cloning of filtered bundles

Hi Nikolay

On 14/01/2024 19:39, Nikolay Edigaryev wrote:
Hello Phillip,

As I understand it, if you're cloning from a bundle file then there
is no remote, so how can we set a remote-specific config?

There is a remote, for more details see df61c88979 (clone: also
configure url for bare clones, 2010-03-29), which has the following
code:

strbuf_addf(&key, "remote.%s.url", remote_name);
git_config_set(key.buf, repo);
strbuf_reset(&key);

You can verify this by creating a bundle on Git 2.43.0 with "git bundle
create bundle.bundle --all" and then cloning it with "git clone
--bare /path/to/bundle.bundle". In my case the following repo-wide
configuration file was created:

[core]
repositoryformatversion = 0
filemode = true
bare = true
ignorecase = true
precomposeunicode = true
[remote "origin"]
url = /Users/edi/src/cirrus-cli/cli.bundle
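
The check described above can be sketched as a small shell script. This
is only an illustration: the directory names, the file name and the
commit identity are made up, and it assumes a "git" binary on PATH.

```shell
#!/bin/sh
# Sketch: clone a bundle as a bare repository and inspect the remote URL.
# All paths and names below are placeholders chosen for this example.
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Build a tiny source repository with a single commit. The identity is
# passed inline so the script does not rely on any global Git config.
git init -q src
echo hello >src/file.txt
git -C src add file.txt
git -C src -c user.name=Example -c user.email=example@example.invalid \
	commit -q -m initial

# Bundle every ref, then clone the bundle as a bare repository.
git -C src bundle create "$workdir/repo.bundle" --all
git clone -q --bare "$workdir/repo.bundle" "$workdir/clone.git"

# Show the URL that "git clone" recorded for the "origin" remote.
origin_url=$(git -C "$workdir/clone.git" config --get remote.origin.url)
echo "remote.origin.url = $origin_url"
```

The printed URL should be the path of the bundle file, matching the
[remote "origin"] section shown above.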

Oh, thanks for clarifying that. I didn't realize we set "origin" to point
to the bundle. That means this patch creates a promisor remote config
pointing to a bundle that does not contain the missing objects. As Junio
said, that doesn't make much sense to me, as the point of the promisor
config is to allow git to lazily fetch the missing objects.

Best Wishes

Phillip

I'm surprised that the proposed change does not require the user to pass
"--filter" to "git clone" as I expected that we'd want to check that the
filter on the command line was compatible with the filter used to create
the bundle. Allowing "git clone" to create a partial clone without the
user asking for it by passing the "--filter" option feels like it is going
to end up confusing users.

Note that currently, when you clone a normal non-filtered bundle with a
'--filter' argument specified, no filtering will take place and no error
will be thrown. "promisor = true" and "partialclonefilter = ..." options
will be set in the repo config, but no .promisor file will be created.
This is even more confusing IMO, but that's how it currently works on
Git 2.43.0.

You have a good point, but I feel like completely preventing cloning of
filtered bundles and requiring a '--filter' argument is very taxing. If
you've already specified a '--filter' when creating a bundle (and thus
your intent to use partially cloned data), why do it multiple times?

What I propose as an alternative here is to act based on the user's
intent when cloning:

* when the user specifies no '--filter' argument, do nothing special,
    allow cloning both types of bundles: normal and filtered (with the
    logic from this patch)

* when the user does specify a '--filter' argument, either:
   * throw an error explaining that filtering of filtered bundles is not
     supported
   * or compare the user's filter specification and the one that is
     in the bundle and only throw an error if they mismatch
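
The decision described in the list above can be sketched as a shell
function. This is only an illustration of the second variant (compare
the filters), not code from the patch: "decide_bundle_clone" and its
messages are hypothetical, the filters are compared as plain strings,
and the case of "--filter" with an unfiltered bundle is not covered by
the proposal, so the sketch simply leaves it alone.

```shell
#!/bin/sh
# Hypothetical helper: decide how "git clone" could treat a bundle.
#   $1 = filter recorded in the bundle ("" for an unfiltered bundle)
#   $2 = filter given on the command line ("" when --filter is absent)
# Prints the decision; returns non-zero when the clone would be refused.
decide_bundle_clone () {
	bundle_filter=$1
	cli_filter=$2

	if test -z "$cli_filter"
	then
		# No --filter given: accept normal and filtered bundles alike.
		echo "clone as-is"
		return 0
	fi

	if test -z "$bundle_filter"
	then
		# Assumption: not covered by the proposal, so nothing changes.
		echo "unfiltered bundle: existing behaviour"
		return 0
	fi

	if test "$bundle_filter" = "$cli_filter"
	then
		echo "filters match: clone"
		return 0
	fi

	echo "error: --filter=$cli_filter does not match bundle filter $bundle_filter" >&2
	return 1
}

decide_bundle_clone "blob:none" ""
decide_bundle_clone "blob:none" "blob:none"
decide_bundle_clone "blob:none" "tree:0" || echo "clone refused"
```

A real implementation would probably need to normalize the filter
specifications before comparing them; the plain string comparison here
is a simplification.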

Let me know what you think about this (and perhaps you have a more
concrete example in mind where this will have negative consequences)
and I'll be happy to do a next iteration.


On Sun, Jan 14, 2024 at 10:00 PM Phillip Wood <phillip.wood123@xxxxxxxxx> wrote:

Hi Nikolay

On 14/01/2024 11:16, Nikolay Edigaryev via GitGitGadget wrote:
From: Nikolay Edigaryev <edigaryev@xxxxxxxxx>

f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
incredibly useful ability to create filtered bundles, which advances
the partial clone/promisor support in Git and allows for archiving
large repositories to object storages like S3 in bundles that are:

* easy to manage
    * bundle is just a single file, it's easier to guarantee atomic
      replacements in object storages like S3 and they are faster to
      fetch than a bare repository since there's only a single GET
      request involved
* incredibly tiny
    * no indexes (which may be more than 10 MB for some repositories)
      and other fluff, compared to cloning a bare repository
    * bundle can be filtered to only contain the tips of refs necessary
      for e.g. code-analysis purposes

However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
bundle, 2022-03-09) the cloning of such bundles was disabled, with a
note that this behavior is not desired, and in the long term this
should be possible.

The commit above states that it's not possible to have this at the
moment due to lack of remote and a repository-global config that
specifies an object filter, yet it's unclear why a remote-specific
config can't be used instead, which is what this change does.

As I understand it, if you're cloning from a bundle file then there
is no remote, so how can we set a remote-specific config?

I'm surprised that the proposed change does not require the user to pass
"--filter" to "git clone" as I expected that we'd want to check that the
filter on the command line was compatible with the filter used to create
the bundle. Allowing "git clone" to create a partial clone without the
user asking for it by passing the "--filter" option feels like it is going
to end up confusing users.

+test_expect_success 'cloning from filtered bundle works' '
+     git bundle create partial.bdl --all --filter=blob:none &&
+     git clone --bare partial.bdl partial 2>err

The redirection hides any error message which will make debugging test
failures harder. It would be nice to see this test check any config set
when cloning and that git commands can run successfully in the repository.

Best Wishes

Phillip



