Hi,

On Wed, 9 Mar 2022 at 21:32, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> The way I read Calvin's suggestion was that you won't allow such a
> random series of "git fetch"es without updating the "this is the
> filter that is consistent with the contents of this repository"
> record, which will lead to inconsistencies. I.e.
>
>  - we must maintain the "filter that is consistent with the contents
>    of this repository", which this series does not do, but we should.

I don't think we should strive to keep this "consistency" --

>  - the "--refetch" is unnecessary and redundant, as long as such a
>    record is maintained; when a filter setting changes, we should
>    do the equivalent of "--refetch" automatically.

-- we don't know how much data has been pulled in by fetches from
different promisor and non-promisor remotes (past and present), or
dynamically faulted in through branch switching or history exploration,
and I can't see any particular benefit in attempting to keep track of
that.

Ævar suggested that in future we might be able to figure out which
commits a user definitively has all the blobs and trees for, so that a
refetch could negotiate from that position and improve efficiency;
nothing in this series precludes such an enhancement.

> ... isn't "git fetch --filter" that does not update the configured
> filter (and does not do a refetch automatically) a bug that made the
> "refetch" necessary in the first place?

I don't believe it's a bug.
A fairly obvious partial clone example I've used before, on repos where
I want the commit history but not all the associated data (especially
when the history is littered with giant blobs I don't care about):

    # Partial clone with no blobs; the checkout then faults in the
    # blobs present at HEAD in bulk to populate the working tree
    git clone example.com/myrepo --filter=blob:none

    # Going forward, future fetches include all associated blobs for
    # new commits
    git config --unset remote.origin.partialclonefilter

Getting all the blobs for all history is something I'm explicitly
trying not to do in this example, but if the next fetch from origin
automatically did a "refetch" after I removed the filter, that's
exactly what would happen.

We don't expect users to update `diff.algorithm` in config just to run
a one-off minimal diff: the `--diff-algorithm=` option on the command
line overrides the config. The same philosophy applies to fetch:
`remote.<name>.partialclonefilter` provides the default filter for
fetches, and a user can override it via `git fetch --filter=`. To me,
this is how Git commands are expected to work.

Partial clones are still relatively new and advanced, and I don't
believe we should try to over-predict what the correct behaviour is
for a user. I'd be happy to add something to the documentation for the
`remote.<name>.partialclonefilter` config setting explaining that
changing or removing the filter won't backfill the local object
database, and that the user would need `git fetch --refetch` for that.

Thanks,
Rob :)
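P.S. for anyone following along: the only filter "record" Git keeps
today is plain per-remote config, which is easy to inspect. A minimal
sketch using a throwaway repo (the values set here are illustrative,
matching what a blob:none partial clone would record):

```shell
# Set up a throwaway repo with the config a partial clone would record
tmp=$(mktemp -d)
git init -q "$tmp/repo"
git -C "$tmp/repo" config remote.origin.promisor true
git -C "$tmp/repo" config remote.origin.partialclonefilter blob:none

# Inspect the recorded promisor/filter state
git -C "$tmp/repo" config --get-regexp 'remote\..*\.(promisor|partialclonefilter)'
```

Unsetting `remote.origin.partialclonefilter` edits that record and
nothing else -- it says nothing about which objects are already in the
local object database, which is why "--refetch" has to be an explicit
request.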