Re: [External] Re: Missing Promisor Objects in Partial Repo Design Doc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Han Young <hanyang.tony@xxxxxxxxxxxxx> writes:
> On Sat, Oct 12, 2024 at 10:05 AM Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote:
> > So I think we'll need to use do_not_die_on_missing_objects. It does have
> > the weakness that if the object is not supposed to be missing, we don't
> > inform the user, but perhaps this is OK here because we know that all
> > objects we encounter on this object walk are promisor objects, so if
> > it's missing, it's OK.
> 
> And I think users would prefer the git command to succeed if possible,
> rather than die on the first (noncritical) error. Maybe show a warning
> and swallow the error?

Just to be clear, this is not an error condition so we shouldn't show an
error or warning. Whenever we repack non-promisor objects in a partial
clone we will almost always encounter missing objects. In the gc repack,
this is mitigated by --exclude-promisor-objects, which checks the
promisor object set whenever a missing object is encountered; it does
not show an error if the missing object is in that set.

My proposal is to use do_not_die_on_missing_objects instead, which
always tolerates missing objects (without showing any error or warning).
This is probably not safe enough for the gc repack, but should be OK
for the fetch repack, since we are only repacking ancestors of known
promisor objects (so we can deduce that the missing objects are promisor
objects).

> > In addition to do_not_die_on_missing_objects, we'll also need the actual
> > code that stops iteration through objects that pass our "best effort"
> > promisor object check. Probably the best place is in get_revision_1()
> > after the NULL check
> 
> get_revision_1() only does commit limiting though. Some callers of rev-list
> also do tree walking on commits,

Ah, yes, you're right. The repack on fetch is one such caller (that will
need tree walking).

> in a (corrupted) partial repo, tree could
> also be missing. There isn't a central place we can stop tree walking,
> callers using this feature would have to implement "tree walking early
> termination" themself.

The repo could have been cloned with a tree filter (e.g.
"--filter=tree:0") too, in which case trees would be missing even if the
repo is not corrupted. But even in a non-corrupted --filter=blob:none
partial clone, we still don't want to iterate through promisor trees, so
that we don't repack them unnecessarily. So yes, get_revision_1() is not
the only place that needs to be changed.

I think that there is a central place to stop tree walking - in
list-objects.c.







[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux