Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> Ok, so this is just an RFC patch, but the concept is pretty simple..
>
> In the kernel, the ARM platform is growing boundlessly, with platform
> files being added for every random SoC out there. One of the things
> that Russell King (arm maintainer) has worried about is that when they
> try to clean stuff up, code *removal* also ends up adding to the
> damage, so if ARM ever gets its act together and is able to
> consolidate a lot of this, it's still going to look very bad in the
> statistics because there will be a lot of damage due to removed files.

Are these removals really "remove old cruft, replace with a better
version which does not have much in common with the removed stuff", or
are they more like "remove N duplicated similar copies of old cruft,
refactoring them properly and the result is used by N callsites"?

The second reason you gave in an earlier discussion why dirstat uses the
damage assessor code was to disregard code movements. It appears to me
that if you spanhash the contents of _all_ files in the preimage and the
postimage of the ARM tree and compute literal-added vs src-copied within
the whole tree, you might be able to mitigate this "false damage", which
arises because the refactoring involved code movement across files but
within the same subsystem.

I guess what it boils down to is what you are trying to measure as the
"goodness" value of a change. Adding a lot of Documentation may be good,
adding a lot of "subarchs that do not deserve to be" may be bad, and
moving common logic from one existing subarch to a common file (which
counts towards "literal-added" in that new common file, at the same time
counting towards deletion, i.e. "size - copied", from the original) and
reusing it in a new subarch by simply calling that common infrastructure
is a very good thing.

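To illustrate the difference between per-file and whole-tree counting, here is a rough sketch. It is not the dirstat or diffcore code: it counts whole lines with a multiset as a crude stand-in for span hashing, and every name in it (line_counts, damage, per_file_damage, whole_tree_damage, the toy paths) is made up for the illustration.

```python
from collections import Counter


def line_counts(text):
    """Multiset of non-empty lines; a crude stand-in for span hashing."""
    return Counter(line for line in text.splitlines() if line.strip())


def damage(pre, post):
    """Return (copied, added, deleted), in lines, between two multisets."""
    copied = sum((pre & post).values())   # lines present on both sides
    added = sum(post.values()) - copied   # "literal-added"
    deleted = sum(pre.values()) - copied  # "size - copied"
    return copied, added, deleted


def per_file_damage(pre_tree, post_tree):
    """Sum added + deleted, comparing each path only against itself."""
    total = 0
    for path in set(pre_tree) | set(post_tree):
        _, added, deleted = damage(line_counts(pre_tree.get(path, "")),
                                   line_counts(post_tree.get(path, "")))
        total += added + deleted
    return total


def whole_tree_damage(pre_tree, post_tree):
    """Pool every file's lines first, so moves across files count as copies."""
    pre, post = Counter(), Counter()
    for text in pre_tree.values():
        pre += line_counts(text)
    for text in post_tree.values():
        post += line_counts(text)
    _, added, deleted = damage(pre, post)
    return added + deleted


# Toy refactoring: two subarchs had duplicated clock code, which moves
# into a common file that both now call.
pre_tree = {
    "mach-a/clock.c": "init_clock();\nset_rate();\nboard_a();\n",
    "mach-b/clock.c": "init_clock();\nset_rate();\nboard_b();\n",
}
post_tree = {
    "common/clock.c": "init_clock();\nset_rate();\n",
    "mach-a/clock.c": "common_clock();\nboard_a();\n",
    "mach-b/clock.c": "common_clock();\nboard_b();\n",
}

print("per-file damage:  ", per_file_damage(pre_tree, post_tree))
print("whole-tree damage:", whole_tree_damage(pre_tree, post_tree))
```

On this toy input the per-file count is 8 lines and the pooled count is 4, because the lines that moved into common/clock.c are recognized as copies rather than as fresh additions.
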
At least, if you count literal-added vs src-copied across the files
within the subarch, instead of doing it per-file, you would be able to
detect the "moving" part more accurately.

Of course, you still cannot tell good additions from bad ones, and you
cannot tell that a new subarch reuses the result of the refactoring by
calling into the refactored code, without understanding the source code,
and I don't think that is within the scope of dirstat.