Re: [PATCH/RFC 0/3] faster inexact rename handling

On Tue, Oct 30, 2007 at 08:38:24AM -0700, Linus Torvalds wrote:

> > with the old and new code. Pairs like Documentation/git-add-script.txt
> > -> Documentation/git-add.txt are not found, because the file is composed
> > almost entirely of boilerplate.
> 
> Ok, that does imply to me that we cannot just drop boilerplate text, 
> because the fact is, lots of files contain boilerplate, but people still 
> think they are "similar".

Well, the problem is that instead of just "dropping" boilerplate text,
we fail to count it as a similarity, but it still counts towards the
file size. It may be that just dropping it totally is the right thing
(in which case those renames _will_ turn up, because they will be filled
with identical non-boilerplate goodness).
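
To make the distinction concrete, here is a toy sketch of the two behaviors.
It is not the actual rename-detection code: the line-set "score", the function
name, and the sample lines are all made up for illustration.

```python
def similarity(src, dst, boilerplate, drop):
    """Toy rename score: shared lines divided by the larger file's size.

    drop=False: boilerplate never counts as shared, but still counts
                toward the file size (the behavior described above).
    drop=True:  boilerplate is removed from both sides before scoring.
    """
    src_set, dst_set = set(src), set(dst)
    # Boilerplate is never credited as a similarity in either mode.
    shared = (src_set & dst_set) - boilerplate
    if drop:
        src_set = src_set - boilerplate
        dst_set = dst_set - boilerplate
    size = max(len(src_set), len(dst_set))
    return len(shared) / size if size else 0.0


# Hypothetical file contents: three boilerplate lines, one real line.
boilerplate = {"SEE ALSO", "AUTHOR", "GIT"}
old = ["add files to the index", "SEE ALSO", "AUTHOR", "GIT"]
new = ["add files to the index", "SEE ALSO", "AUTHOR", "GIT"]

counted = similarity(old, new, boilerplate, drop=False)
dropped = similarity(old, new, boilerplate, drop=True)
```

With these made-up inputs the two files are identical, yet the first mode
scores them at only 0.25 because the boilerplate inflates the size; dropping
it entirely gives 1.0.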

> Hmm. I hope that is sufficient. But I suspect it may well not be. 
> Especially since you ignore boiler-plate lines for *some* files but not 
> others (ie it depends on which file you happen to find it in first).

Yes, that part bothers me a little, so I think a "too common, ignore"
overflow flag would at least be better.
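
Roughly what I have in mind, as a sketch only. The index layout, the
`build_index` name, and the threshold of 2 are assumptions for illustration,
not what the patch does today.

```python
from collections import defaultdict

TOO_COMMON = 2  # hypothetical limit on how many files may share a line


def build_index(files, limit=TOO_COMMON):
    """Map each line to the files containing it, flagging over-common lines.

    Once a line shows up in more than `limit` files it is moved to the
    `overflowed` set and ignored for every file, so the result does not
    depend on which file we happened to see first.
    """
    owners = defaultdict(set)
    overflowed = set()
    for name, lines in files.items():
        for line in set(lines):
            if line in overflowed:
                continue  # already flagged "too common, ignore"
            owners[line].add(name)
            if len(owners[line]) > limit:
                overflowed.add(line)
                del owners[line]
    return dict(owners), overflowed


# Hypothetical inputs: "GIT" appears in all three files.
files = {
    "a.txt": ["alpha", "GIT"],
    "b.txt": ["beta", "GIT"],
    "c.txt": ["gamma", "GIT"],
}
owners, overflowed = build_index(files)

# Same files in reverse order, to check order independence.
rev_owners, rev_overflowed = build_index(dict(reversed(list(files.items()))))
```

The point of the flag is that "GIT" ends up ignored for all three files, no
matter which one is processed first.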

But I think the best thing to do now is for me to shut up and see what
the results look like with the tweaks I have mentioned.

-Peff