Re: [PATCH 3/3] Make clear_commit_marks() clean harder

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Mon, 3 Jul 2006, Johannes Schindelin wrote:
> 
> On Mon, 3 Jul 2006, Junio C Hamano wrote:
> 
> > Rene Scharfe <rene.scharfe@xxxxxxxxxxxxxx> writes:
> > 
> > > Don't care if objects have been parsed or not and don't stop when we
> > > reach a commit that is already clean -- its parents could be dirty.
> > 
> > There is something quite wrong with this patch.
> 
> I always had the feeling that it was wrong to traverse not-yet-parsed 
> parents: How could a revision walk possibly come to a certain commit 
> without at least one continuous history of now-parsed objects?

No, that's not the problem.

The problem is that if we unconditionally traverse parents - whether 
parsed or not - any merge will basically result in a 2* expansion of the 
working set: we'll traverse all children twice (whether they meet again or 
not).

So the cost of doign unconditional traversals of parents basically 
approaches 2^n, where 'n' is the number of merges.

Now, the fact that we only traverse parents without adding new ones (and 
the decision on whether it is parsed or not is irrelevant - the only 
relevant part being that we don't parse any _new_ ones) means that each 
commit itself is very cheap to traverse, but O(2^n) ends up meaning that 
even a small constant will eventually be pretty big.

The proper fix is _not_ to add the "object->parsed" check (which is silly, 
wrong, and doesn't fix anything at all), but to add a check for whether 
the object has been seen or not.

In the case of clearing flags, you have two choices:

 - _set_ a new flag ("already cleared"). This would work - once - but is 
   obviously pretty bad.

   This is what we do in all the other cases. We usually call the flag 
   SEEN or similar.

 - depend on the flags being "dense", and saying that we depend on the 
   fact that in order for any of the flags to have been set in the first 
   place, at least _one_ of them needs to be set in the path leading up to 
   that commit.

Now, for the specific case of get_merge_bases(), the flags _are_ dense. 
Individual flags may not be (eg the "UNINTERESTING" flag, whatever it's 
called, will not be dense), but the question of "is _any_ of the flags we 
care about set" _will_ be dense.

As such, adding a

	/* Have we already cleared this? */
	if (!(mask & object->flags))
		return;
	object->flags &= ~mask;

to the traversal function will fix the O(m+2^n) behaviour, and turn the 
traversal into O(m+n) (where "n" is the number of merges, and "m" is the 
total number of commits).

		Linus
-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]