Re: [PATCH v2 2/8] revision: learn '--no-kept-objects'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 03, 2021 at 10:58:57PM -0500, Taylor Blau wrote:

> @@ -3797,6 +3807,11 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
>  		return commit_ignore;
>  	if (revs->unpacked && has_object_pack(&commit->object.oid))
>  		return commit_ignore;
> +	if (revs->no_kept_objects) {
> +		if (has_object_kept_pack(&commit->object.oid,
> +					 revs->keep_pack_cache_flags))
> +			return commit_ignore;
> +	}

OK, so this has the same "problems" as --unpacked, which is that we can
miss some objects (i.e., things that are reachable but not-kept may not
be reported). But it should be OK in this version of the series, because
we will not be relying on it for selection of objects, but only to fill
in ordering / namehash fields.

Should we warn people about that, either as a comment or in the commit
message?

> +--no-kept-objects[=<kind>]::
> +	Halts the traversal as soon as an object in a kept pack is
> +	found. If `<kind>` is `on-disk`, only packs with a corresponding
> +	`*.keep` file are ignored. If `<kind>` is `in-core`, only packs
> +	with their in-core kept state set are ignored. Otherwise, both
> +	kinds of kept packs are ignored.

Likewise, I wonder whether we need to expose this mode to users.
Normally I'm a fan of doing so, because it allows scripted callers
access to more of the internals, but:

  - the semantics are kind of weird about where we draw the line between
    performance and absolute correctness

  - the "in-core" thing is a bit weird for callers of rev-list; how do I
    as a caller mark a pack as kept-in-core? I think it's only an
    internal pack-objects thing.

Once we support this in rev-list, we'll have to do it forever (or deal
with deprecation, etc). If we just need it internally, maybe it's wise
to leave it as a something you ask for by manipulating rev_info
directly. Or perhaps leave it as an undocumented interface we use for
testing, and not something we promise to keep working.

> --- a/list-objects.c
> +++ b/list-objects.c
> @@ -338,6 +338,13 @@ static void traverse_trees_and_blobs(struct traversal_context *ctx,
>  			ctx->show_object(obj, name, ctx->show_data);
>  			continue;
>  		}
> +		if (ctx->revs->no_kept_objects) {
> +			struct pack_entry e;
> +			if (find_kept_pack_entry(ctx->revs->repo, &obj->oid,
> +						 ctx->revs->keep_pack_cache_flags,
> +						 &e))
> +				continue;
> +		}

This hunk is interesting.

There is no similar check for revs->unpacked in list-objects.c to cut
off the traversal. And indeed, running "rev-list --unpacked" will
generally look at the _whole_ tree for a commit that is unpacked, even
if all of the tree entries are packed. That's something we might
consider changing in the name of performance (though it does increase
the number of cases where --unpacked will fail to find an unpacked but
reachable object).

But this is a funny place to put it. If I understand it correctly, it is
cutting off the traversal at the very top of the tree. I.e., if we had a
commit that is not-kept, we'd queue it's root tree. And then we might
find that the root tree is kept, and avoid traversing it. But if we _do_
traverse it, we would look at every subtree it contains, even if they
are kept! That's because we recurse the tree via the recursive
process_tree(), not by queueing more objects in the pending array here.

So this check seems to exist in a funny middle ground. I think it's
unlikely to catch anything useful (usually commits have a unique root
tree; it's all of the untouched parts of the subtrees that will be in
the kept packs). IMHO we should either drop it (and act like
"--unpacked", accepting that we may traverse some extra tree objects),
or we should go all-in on performance and cut it off in the top of
process_tree().

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux