Re: [PATCHv2 3/3] Teach --dirstat to not completely ignore rearranged lines within a file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Monday 11 April 2011, Junio C Hamano wrote:
> Johan Herland <johan@xxxxxxxxxxx> writes:
> > Currently, the --dirstat analysis fails to detect when lines within a
> > file are rearranged, because the "damage" calculated by show_dirstat()
> > is 0. However, if the SHA1 sum has changed, we already now that there
> > should be at least some minimum amount of damage.
> 
> This logic is sensible, modulo that "fails to detect" is actually
> "ignores mere line movements on purpose".

I apologize for my commit message not having caught up with discussion 
around this issue. I came into the discussion from the POV of "--dirstat 
does not pick up what --stat picks up; there must be a bug in --dirstat", 
and my original objective was therefore to "fix" --dirstat to be "more like 
--stat". Obviously, I now know exactly why --dirstat is different, and that 
we don't want to fundamentally change it. My commit message should have been 
rephrased in a more positive light as a result. Feel free to fix before 
applying.

> In any case, if the object names are different, we already know that
> there is _some_ damage, and it is very unintiutive to claim that there
> is _no_ damage.

Agreed.

> > This patch teaches show_dirstat() to assign a minimum amount of damage
> > (== 1) to entries for which the analysis otherwise yields zero damage.
> 
> So it is perfectly in line with the above logic to give a minimum here.
> Zero was simply just unintuitive, and this is a good fix to the problem.
> 
> > Obviously this is not a complete fix, but it's at least better to
> 
> I however do not understand what "a complete fix" means in this context.
> You've fixed the unintuitiveness, and as far as the description in the
> introductory paragraph of the problem goes, I think this already is a
> complete fix.

I still feel that a file with 1000 rearranged lines should somehow count 
"more" than a file with only 1 rearranged line, but it's hard to get there 
without futzing with diffcore_count_changes(), probably making the whole 
thing considerably more expensive... So in that sense, I agree that the 
current solution is probably as complete as we can get.


...Johan

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]