On Thu, Oct 19, 2017 at 1:31 PM, Jeff King <peff@xxxxxxxx> wrote: > For computing moved lines, we feed the characters of each > line into a hash. When we've been asked to ignore > whitespace, then we pick each character using next_byte(), > which returns -1 on end-of-string, which it determines using > the start/end pointers we feed it. > > However our check of its return value treats "0" the same as > "-1", meaning we'd quit if the string has an embedded NUL. I agree. The code looks correct. > This is unlikely to ever come up in practice since our line > boundaries generally come from calling strlen() in the first > place. get_string_hash is called from prepare_entry, which in turn is called from add_lines_to_move_detection or mark_color_as_moved diff_flush_patch_all_file_pairs that constructs the arguments in diff_flush_patch run_diff run_diff_cmd builtin_diff (part "/* Crazy xdl interfaces.. */") xdi_diff_outf( fn_out_consume as arg!) xdi_diff xdl_diff xdl_call_hunk_func -> fn_out_consume(cb, line, len) xdl_call_hunk_func however uses pointer arithmetic instead of strlen. So I think this sentence is not a good idea to put in the commit message. It may not occur in practice, due to binary files detection using NUL as a signal, but conceptually our move-colored(!) diffs should be compatible with NULs with this patch now. > But it was a bit surprising to me as a reader of the > next_byte() code. And it's possible that we may one day feed > this function with more exotic input, which otherwise works > with arbitrary ptr/len pairs. Good point. > > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > I noticed that we make an extra copy of each line here, just to feed it > to memihash! I guess "-w" is not a critical-performance code path, but > this could be fixed if we could do memhash() incrementally (e.g., by > putting the FNV state into a struct and letting callers "add" to it > incrementally). Maybe an interesting #leftoverbits, though I'd want to > see timing tests that show it's worth doing. > I agree. Thanks, Stefan