On Mon, May 08, 2017 at 10:02:58AM +0900, Junio C Hamano wrote: > Stefan Beller <sbeller@xxxxxxxxxx> writes: > > > I guess it is ok for now and in this series, but we may want > > to split up diff.[ch] in the future into multiple finer grained files. > > For what end? Such a split would require more symbols internal to > diff.[ch] to become external, which is a big downside, so we need to > have a large reward to compensate it if we were to go there. I think there are two sides to that coin. Let's say you have a file with five functions (and you can replace function with structs, global variables, etc; any unit of complexity that might or might not be externally visible): /* called by both a() and b() */ static void a_and_b_helper(); /* called by a() */ static void a_helper(); void a(); /* called by b() */ static void b_helper(); void b(); When they are in the same file, b() and b_helper() can see a_helper(), even though they don't need it. And that means increased complexity in dealing with a_helper(), because its visibility is larger than necessary. We have to worry about what a change to it might do to b() (or more realistically, c(), d(), etc). If we split this apart, we end up with three files: common.c: void a_and_b_helper(); a.c: static void a_helper(); void a(); b.c: static void b_helper(); void b(); The specific helpers have less visibility, which is good. The public functions a() and b() were already public, so no change. But now the common helper is public, too, even though nobody except a() and b() care about it. So it's a tradeoff. And the important question is: is the bleed-over between a() and b() worse than the common bits being made public? That can't be answered without looking at how many distinct "a" and "b"-like chunks there are in the file, and what the common bits look like. I'm not sure of the answer for diff.c. Without digging, both ends of the spectrum seem equally plausible to me: that it is mostly a set of N distinct units which could be split apart, or that it really is a few public functions calling the same common core over and over. And a follow-on question is what we can do to mitigate the cost of making the common code public. We could advertise a_and_b_helper() only in diff-internal.h, and then makes it only semi-public (anybody can include that, of course, but including diff-internal.h seems like it ought to be a red flag). That only helps the programmer, though; we'd still be losing out on compiler optimizations and static analysis that only looks at one translation unit at a time. Phew. That ended up a little long-winded. All of it is to say that I don't think it makes sense to reject a split out-of-hand, but nor is it always a good thing. It depends on what's in the file. -Peff