Re: Wrong damage counting in diffcore_count_changes?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> This changes it so that
>
>  - whenever it bypasses a destination hash (because it doesn't match a 
>    source), it counts the bytes associated with that as "literal added"
>
>  - at the end (once we have used up all the source hashes), we do the same 
>    thing with the remaining destination hashes.
>
>  - when hashes do match, and we use the difference in counts as a value, 
>    we also use up that destination hash entry (the 'd++'
>
> This _seems_ to make --dirstat output more sensible, and I'd hope that 
> that in turn should mean that file rename detection should also be more 
> sensible. But I haven't actually verified it in any way. Maybe I just 
> screwed up file rename detection entirely.
>
> Did I miss something?

Well, the current loop structure largely comes from your eb4d0e3 (optimize
diffcore-delta by sorting hash entries., 2007-10-02) so you would be the
best judge of the change ;-), even though it seems that the current code
inherited the "skipping of dst when src does not exist" bug from c06c796
(diffcore-rename: somewhat optimized., 2006-03-12).

Earlier code, e.g. as of ba23bbc (diffcore-delta: make change counter to
byte oriented again., 2006-03-04), used to be very simple minded and
inefficient and iterated over src_count[] and dst_count[] arrays for the
entire hash value namespace and there was no such "missing, skipped,
happened to have the next value" issue.

I think you understand the original intention of the function correctly
and from a cursory look of the patch I think it fixes the bug in the
current code, and any changes to renames/breaks should be improvements.

But my lunchbreak is over, and my evening is booked, so I unfortunately
cannot spend more time thinking about any possible fallouts from this
change until tomorrow.

Sorry, and thanks.

> 		Linus
> ---
>  diffcore-delta.c |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/diffcore-delta.c b/diffcore-delta.c
> index e670f85..7cf431d 100644
> --- a/diffcore-delta.c
> +++ b/diffcore-delta.c
> @@ -201,10 +201,15 @@ int diffcore_count_changes(struct diff_filespec *src,
>  		while (d->cnt) {
>  			if (d->hashval >= s->hashval)
>  				break;
> +			la += d->cnt;
>  			d++;
>  		}
>  		src_cnt = s->cnt;
> -		dst_cnt = d->hashval == s->hashval ? d->cnt : 0;
> +		dst_cnt = 0;
> +		if (d->cnt && d->hashval == s->hashval) {
> +			dst_cnt = d->cnt;
> +			d++;
> +		}
>  		if (src_cnt < dst_cnt) {
>  			la += dst_cnt - src_cnt;
>  			sc += src_cnt;
> @@ -213,6 +218,10 @@ int diffcore_count_changes(struct diff_filespec *src,
>  			sc += dst_cnt;
>  		s++;
>  	}
> +	while (d->cnt) {
> +		la += d->cnt;
> +		d++;
> +	}
>  
>  	if (!src_count_p)
>  		free(src_count);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]