On Tue, May 09, 2017 at 10:49:19AM +0900, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > > > The specific helpers have less visibility, which is good. The public > > functions a() and b() were already public, so no change. But now the > > common helper is public, too, even though nobody except a() and b() care > > about it. > > > > So it's a tradeoff. And the important question is: is the bleed-over > > between a() and b() worse than the common bits being made public? > > At this point you are saying the same thing as I said ;-) Maybe, but I think my hunch is that the answer to the question is different than yours. ;) > I haven't touched diff.c for a while, but my impression was that it > was already a smallest logical unit, especially since some bits like > diff-lib.c (the interface for the front-end programs to drive the > machinery via the three standard pairs of sources) and ws.c (diff > subsystem to deal with whitespace errors) are already split out (and > needless to say, the diffcore transformations are all in their own > separate files). Over time people may have added what does not > belong there merely for convenience, which may want to get ejected, > but I do not think of many instances of them offhand. Just skimming the file, I suspect diff_filespec could be broken out into a semi-public interface (many of its functions already are public anyway), probably dirstat could be broken out, and possibly other specific formats could be broken into their own files. But I didn't spend much time on it. > It appears that the textconv related helpers (which was where this > discussion thread started from) could be first restructured so that > they do not depend on diff_filespec and turned into a more generic > "I have this path and a blob object name, please run textconv to it > and..." interface and then moved out of the file into its own > textconv.[ch]. It does not need access to quite a few fields in > diff_filespec structure (e.g. it shouldn't care what filemode the > thing has). The diff machinery wants the result in a contiguous > piece of memory and that could be a good starting point, but I said > "and..." above because I am not sure if it is the best generic > thing the restructured interface can do to the result. I think some of the users of textconv do have to go through some contortions to turn their buffers into diff_filespecs (e.g., blame and grep), and it would be nice to fix that. But the diff-driver lookup and caching is tied to the diff_filespec, too (which is perhaps a sign of a misfeature in the first place, as the "diff" attribute is being used for things like "grep" and "cat-file"). So we'd have to resolve that. So I think there is room for improvement there, but I'm also not sure it is worth the trouble. AFAIK neither textconv nor the general length of diff.c is holding up anybody's work or even making it significantly more difficult. If people have spare cycles, they are free to work on anything, of course. But I can think of code that is probably in more dire need of refactoring. :) -Peff