On Thu, May 18, 2017 at 4:33 PM, Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote: > I know I suggested the paragraph above, but after rereading your patch > set, I think I finally understand what you're trying to accomplish. > I think it's better to combine patches 4/20, 5/20, and 6/20, with the > following commit message: and patch 7/20 (diff.c: inline emit_line_0 into emit_line) > diff: introduce more flexible emit function > > Currently, diff output is written either through the emit_line_0 > function or through the FILE * in struct diff_options directly. To > make it easier to teach diff to buffer its output (which will be done > in a subsequent commit), introduce a more flexible emit() function. In > this commit, direct usages of emit_line_0() are replaced with emit(); > subsequent commits will also replace usages of the FILE * with emit(). ok. sounds reasonable to me. I kept them separate for easier review originally, but I can certainly combine them again. > > And the function itself can be documented this way (with the appropriate > formatting): I have some documentation in 19/20 in diff.h for the data structure stored. We'd want to discuss where to put the documentation. I'd not like to add such documentation to some static function in diff.c but rather only document the data structure, and then (after in-lining <...>_0 to emit_line) keep the function documentation rather short as: /* see struct buffered_patch_line */ static void emit_line(...lots of args...) { /* work on said data structure */ ... > If you do all that, then the buffering patch (19/20) can be improved by > adding this comment somewhere in the file: > > Buffer the diff output into ??? instead of immediately writing it to > "file". > > NEEDSWORK: The contents of the ??? array - in particular, how the diff > output is divided into array elements - is not precisely defined; some > functions may emit a line all at once (resulting in one element) > whereas some others may emit a line piecemeal (resulting in more than > one element). Ideally, the code in this file should be structured so > that we do not have such imprecision, but in the meantime, callers > that request buffering should ensure that the diff output is divided > the way they expect (and have tests to ensure that it remains so). I want to rename that struct to "buffered_diff_piece" or "partial_diff_output", not implying we have a line or other another specific piece. Rethinking the discussion with Junio, what the correct abstraction level is, we could even name it "one_color_piece", or "colored_diff_part", to imply we'd want to have as much content as possible in the same color. (And line endings always being in RESET color, we'd have a natural separation at EOL) > >> With this patch other callers hard code the sign (which are '+', '-', >> ' ' and '\\') such that we do not run into unexpectedly emitting an >> erroneous '\0'. > > I still don't understand this paragraph - can you rewrite this in the > imperative tense? As in this patch the function signature is the same, but just changes meaning of one of the arguments, one could imagine that there could be a caller with first='\0' given some interesting data/mode of operation, which would have instructed to literally output a '\0'. We would not do this any more as the meaning changed. However this hypothetical subtle bug was not introduced, because there are no callers that call the function with a 'first' depending on user provided data or otherwise hard-coded expectation of a '\0' output. Maybe I'll just drop this part.