Re: [PATCH] revision.c: reduce object database queries

On Tue, Feb 27, 2018 at 03:16:58PM -0800, Junio C Hamano wrote:

> >> This code comes originally from 454fbbcde3 (git-rev-list: allow missing
> >> objects when the parent is marked UNINTERESTING, 2005-07-10). But later,
> >> in aeeae1b771 (revision traversal: allow UNINTERESTING objects to be
> >> missing, 2009-01-27), we dealt with calling parse_object() on the
> >> parents more directly.
> >>
> >> So what I wonder is whether this code is simply redundant and can go
> >> away entirely. That would save the has_object_file() call in all cases.
> 
> Hmm, interesting. I forgot everything I did around this area, but you
> are right.

I'll leave it to Stolee whether he wants to dig into removing the
has_object_file() call. I think it would do the right thing, but the
most interesting bit would be how it impacts the timings.

> > There's a similar case for trees. ...
> > though technically the existing code allows _missing_ trees, but
> > not corrupt ones.
> 
> True, but the intention of this "do not care too much about missing
> stuff while marking uninteresting" effort is aligned better with
> ignoring corrupt ones, too, I would think, as "missing" in that
> sentence is in fact about "not available", and stuff that exists in
> corrupt form is still not available anyway.  So I do not think it
> is a bad change to start allowing corrupt ones.

Agreed. Here it is in patch form, though as we both said, it probably
doesn't matter that much in practice. So I'd be OK dropping it out of
a sense of conservatism.

-- >8 --
Subject: [PATCH] mark_tree_contents_uninteresting: drop has_object check

It's generally acceptable for UNINTERESTING objects in a
traversal to be unavailable (e.g., see aeeae1b771). When
marking trees UNINTERESTING, we access the object database
twice: once to check if the object is missing (and return
quietly if it is), and then again to actually parse it.

We can instead just try to parse; if that fails, we can then
return quietly. That halves the effort we spend on locating
the object.

Note that this isn't _exactly_ the same as the original
behavior, as the parse failure could be due to other
problems than a missing object: it could be corrupted, in
which case the original code would have died. But the new
behavior is arguably better, as it covers the object being
unavailable for any reason. We'll also still issue a warning
to stderr in such a case.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 revision.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/revision.c b/revision.c
index 5ce9b93baa..221d62c52b 100644
--- a/revision.c
+++ b/revision.c
@@ -51,12 +51,9 @@ static void mark_tree_contents_uninteresting(struct tree *tree)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
-	struct object *obj = &tree->object;
 
-	if (!has_object_file(&obj->oid))
+	if (parse_tree_gently(tree, 1) < 0)
 		return;
-	if (parse_tree(tree) < 0)
-		die("bad tree %s", oid_to_hex(&obj->oid));
 
 	init_tree_desc(&desc, tree->buffer, tree->size);
 	while (tree_entry(&desc, &entry)) {
-- 
2.16.2.582.ge2c16ac3c4
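
For anyone who wants to see the behavior change in isolation, here is a
rough standalone sketch. It does not use git's real object database;
mock_store, mock_lookup(), mock_has_object(), mock_parse() and the two
mark_*() helpers are all hypothetical stand-ins invented for
illustration. It only models the two points from the commit message: the
old flow looks an object up twice, and a corrupt object used to be fatal
but is now skipped quietly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an object store; NOT git's real API. */
enum mock_state { MOCK_OK, MOCK_CORRUPT };

struct mock_object {
	const char *name;
	enum mock_state state;
};

struct mock_store {
	const struct mock_object *objects;
	size_t nr;
	unsigned lookups; /* how many times the store was searched */
};

enum mark_result { MARK_DONE, MARK_SKIPPED, MARK_FATAL };

/* Search the store; every call counts as one lookup. */
const struct mock_object *mock_lookup(struct mock_store *store,
				      const char *name)
{
	size_t i;

	assert(store && name);
	store->lookups++;
	for (i = 0; i < store->nr; i++)
		if (!strcmp(store->objects[i].name, name))
			return &store->objects[i];
	return NULL; /* a missing object is simply not in the store */
}

/* Stand-in for an existence check: one lookup, no content validation. */
int mock_has_object(struct mock_store *store, const char *name)
{
	return mock_lookup(store, name) != NULL;
}

/* Stand-in for a gentle parse: fails on missing or corrupt objects. */
int mock_parse(struct mock_store *store, const char *name)
{
	const struct mock_object *obj = mock_lookup(store, name);

	return (obj && obj->state == MOCK_OK) ? 0 : -1;
}

/* Old flow: check existence, then parse; corruption is fatal. */
enum mark_result mark_old(struct mock_store *store, const char *name)
{
	if (!mock_has_object(store, name))
		return MARK_SKIPPED;
	if (mock_parse(store, name) < 0)
		return MARK_FATAL; /* the original code calls die() here */
	return MARK_DONE;
}

/* New flow: just try to parse; any failure is skipped quietly. */
enum mark_result mark_new(struct mock_store *store, const char *name)
{
	if (mock_parse(store, name) < 0)
		return MARK_SKIPPED;
	return MARK_DONE;
}
```

In this model, a healthy object costs two lookups with mark_old() and
one with mark_new(). A missing object is skipped by both, and a corrupt
one moves from MARK_FATAL to MARK_SKIPPED. Real timings depend on how
expensive the lookup is in the actual object database, which this
sketch does not try to measure.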



