Re: RFC: "negative" dirstat

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> Ok, so this is just an RFC patch, but the concept is pretty simple..
>
> In the kernel, the ARM platform is growing boundlessly, with platform
> files being added for every random SoC out there. One of the things
> that Russell King (arm maintainer) has worried about is that when they
> try to clean stuff up, code *removal* also ends up adding to the
> damage, so if ARM ever gets its act together and is able to
> consolidate a lot of this, it's still going to look very bad in the
> statistics because there will be a lot of damage due to removed files.

Are these removals really "remove old cruft, replace with a better version
that does not have much in common with the removed stuff", or are they more
like "remove N duplicated similar copies of old cruft, refactor them
properly, and have the result used by N callsites"?

The second reason you gave in an earlier discussion for why dirstat uses the
damage assessor code was to disregard code movements. I wonder if you can
mitigate this "false damage" (false because the refactoring involved code
movement across files but within the same subsystem) by spanhashing the
contents of _all_ files in the preimage and the postimage of the ARM tree
and computing literal-added vs src-copied within the whole tree.

I guess what it boils down to is what you are trying to measure as the
"goodness" value of a change. Adding a lot of Documentation may be good,
adding a lot of "subarchs that do not deserve to be" may be bad, and
moving common logic from one existing subarch to a common file (which
counts towards "literal-added" in that new common file, at the same time
counting towards deletion, i.e. "size - copied", from the original) and
reusing it in a new subarch by simply calling that common infrastructure
is a very good thing. At least, if you count literal-added vs src-copied
across the files within the subarch, instead of doing it per-file, you
would be able to detect the "moving" part more accurately. Of course, you
still cannot tell good additions from bad ones, and you cannot tell that a
new subarch reuses the result of refactoring by calling into the refactored
code, without understanding the source code, and I don't think that is
within the scope of dirstat.
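
To make the per-file vs. per-subarch difference concrete, here is a rough
sketch. It is not the damage assessor code itself: whole lines stand in for
the hashed spans, and every name in it (span_counts, count_changes, the
mach-a/mach-b/common paths) is made up for illustration. Damage is taken as
"size - copied" plus "literal-added", as described above.

```python
from collections import Counter

def span_counts(text):
    """Map each line (a crude stand-in for a hashed span) to the bytes it covers."""
    counts = Counter()
    for line in text.splitlines(keepends=True):
        counts[line] += len(line.encode())
    return counts

def count_changes(src, dst):
    """Return (src_copied, literal_added) in bytes for one src/dst pair."""
    s, d = span_counts(src), span_counts(dst)
    copied = sum(min(n, s[span]) for span, n in d.items())
    added = sum(d.values()) - copied
    return copied, added

def damage(src, dst):
    """Deleted bytes (size - copied) plus literally added bytes."""
    copied, added = count_changes(src, dst)
    return (len(src.encode()) - copied) + added

def per_file_damage(pre, post):
    """Compare each path only against itself; a missing side counts as empty."""
    paths = set(pre) | set(post)
    return sum(damage(pre.get(p, ""), post.get(p, "")) for p in paths)

def tree_wide_damage(pre, post):
    """Pool all files of the subtree on each side before comparing."""
    whole_pre = "".join(pre[p] for p in sorted(pre))
    whole_post = "".join(post[p] for p in sorted(post))
    return damage(whole_pre, whole_post)

# Hypothetical refactoring: a helper duplicated in two subarchs moves to a
# common file, and both subarchs keep only their own init call.
HELPER = "int clk_rate(void)\n{\n\treturn 42;\n}\n"
pre = {
    "mach-a/clk.c": HELPER + "a_init();\n",
    "mach-b/clk.c": HELPER + "b_init();\n",
}
post = {
    "common/clk.c": HELPER,
    "mach-a/clk.c": "a_init();\n",
    "mach-b/clk.c": "b_init();\n",
}

print("per-file damage: ", per_file_damage(pre, post))
print("tree-wide damage:", tree_wide_damage(pre, post))
```

Counted per file, the helper is charged once as a literal addition in the
new common file and once as a deletion in each of the two originals. Pooled
across the subtree, the common file is seen as a copy, so only the removal
of the duplicate remains as damage.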