Re: [PATCH v2 0/8] fetch: add repair: full refetch without negotiation (was: "refiltering")

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ævar,

Thanks for taking the time to look into this,

On Mon, 28 Feb 2022 at 16:54, Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
> I realize this was probably based on feedback on v1 (I didn't go back
> and re-read it, sorry).

Yes, `fetch --repair` was from Jonathan Tan's v1 feedback[1], where he
pointed out it could fill in lost objects from any remote in a more
generally useful fashion.

My goal here is to refetch with a different filter so that I get the
outcome of a `clone --filter=` without having to chuck my object
directory. But the actual implementation doesn't need to know anything
specific about filters, so the original "refilter" name I had isn't
really right.

> But I feel strongly that we really should name this something other than
> --repair. I don't care much if it isn't that :) But maybe
> --expand-filters, --fleshen-partial or something like that?

fleshen-partial sounds like a horror movie scene to me.

1. `--refetch`
2. `--as-clone`
3. `--expand-filter` (though TBC you don't necessarily need a filter)
4. `--refilter`
5. something else

> So first (and partially as an aside): Is a "noop" negotiatior really
> want we want at all? Don't we instead want to be discovering those parts
> of our history that are closed under reachability (if any) and say we
> HAVE those things during negotiation?

At an object level we don't have any means of knowing what has or
hasn't been obtained via fetch to a partial clone with different
`--filter` args (via config or cli), dynamic fault-ins, or sourced
from a different remote. Fetch negotiation only occurs for refs and
their associated commits/histories, but filtering occurs at the blob
or tree level — so we often HAVE a commit but not all of its
trees/blobs, whereupon negotiation skips that commit and all it's
associated objects.

> But secondly, on the "--repair" name: The reason I mentioned that is
> that I'd really like us to actually have a "my repo is screwed, please
> repair it".

Feels like people would look at `fsck` for that over `fetch`? Maybe
not. Anyway, I get the point about the naming still not being right
:-)

> But (and I haven't tested, but I'm pretty sure), this patch series isn't
> going to give you that. The reasons are elaborated on in [1], basically
> we try really hard to re-use local data, and due to that & the collision
> detection will often just hard die early in object walking.
>
> But maybe I'm wrong, have you actually tested this with *broken* objects
> as opposed to just missing ones with repo filters + promisors in play?
> Our t/*fsck* and t/*corrupt*/ etc. tests have some of those.

Correct: I haven't tested with such objects/broken ODBs. Ideally
repack/gc/etc would prefer a new-fixed pack over the old-broken
pack/object but that's not really what I'm aiming to achieve here or
am interested in.

1. https://lore.kernel.org/git/20220202185957.1928631-1-jonathantanmy@xxxxxxxxxx/

Thanks,

Rob :)




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux