Re: [PATCH 2/3] upload-pack: skip parse-object re-hashing of "want" objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/7/2022 10:36 AM, Derrick Stolee wrote:
> On 9/6/2022 7:05 PM, Jeff King wrote:

>> There is one thing to note, though: the
>> change in get_reference() affects not just pack-objects, but rev-list,
>> git-log, etc. We could use a flag to limit to index-pack here, but we
>> may already skip hash checks in this instance. For commits, we'd skip
>> anything we load via the commit-graph. And while before this commit we
>> would check a blob fed directly to rev-list on the command-line, we'd
>> skip checking that same blob if we found it by traversing a tree.
> 
> It's worth also mentioning that since you isolated the change to
> get_reference(), it is only skipping the parse for the requested tips
> of the revision walk. As we follow commits and trees to other objects
> we do not use parse_object() but instead use the appropriate method
> (parse_commit(), parse_tree(), and do not even parse blobs).

After writing this, it was bothering me that 'rev-list --verify' should
be checking for corruption throughout the history, not just at the tips.

This is done via the condition in builtin/rev-list.c:finish_object():

static int finish_object(struct object *obj, const char *name, void *cb_data)
{
	struct rev_list_info *info = cb_data;
	if (oid_object_info_extended(the_repository, &obj->oid, NULL, 0) < 0) {
		finish_object__ma(obj);
		return 1;
	}
	if (info->revs->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT)
		parse_object(the_repository, &obj->oid);
	return 0;
}

So this call is the one that is used most-often by the rev-list command,
but isn't necessary to update for the purpose of our desired speedup. This
is another place where we would want to use read_object_file(). (It may even
be the _only_ place, since finish_object() should be called even for the
tip objects.)

In case you think it is valuable to ensure we test both cases ("tip" and
"not tip") I modified your test to have a third commit and test two rev-list
calls:

# An actual bit corruption is more likely than swapped commits, but
# this provides an easy way to have commits which don't match their purported
# hashes, but which aren't so broken we can't read them at all.
test_expect_success 'rev-list --verify-objects notices swapped commits' '
	git init swapped-commits &&
	(
		cd swapped-commits &&
		test_commit one &&
		test_commit two &&
		test_commit three &&
		one_oid=$(git rev-parse HEAD) &&
		two_oid=$(git rev-parse HEAD^) &&
		one=.git/objects/$(test_oid_to_path $one_oid) &&
		two=.git/objects/$(test_oid_to_path $two_oid) &&
		mv $one tmp &&
		mv $two $one &&
		mv tmp $two &&
		test_must_fail git rev-list --verify-objects HEAD &&
		test_must_fail git rev-list --verify-objects HEAD^
	)
'

But this is likely overkill.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux