Bo Yang wrote: > The algorithm that maps lines from post-image to pre-image is in > the function map_lines. Generally, we use simple line number > calculation method to do the map. > +#define SCALE_FACTOR 4 > +/* > + * [p_start, p_end] represents the pre-image of current diff hunk, > + * [t_start, t_end] represnets the post-image of the current diff hunk, ^^ Typo here ------------------/ > + * [start, end] represents the currently interesting line range in > + * post-image, > + * [o_start, o_end] represents the original line range that coresponds > + * to current line range. > + */ > +void map_lines(long p_start, long p_end, long t_start, long t_end, > + long start, long end, long *o_start, long *o_end) > +{ [...] > + /* > + * A heuristic for lines mapping: > + * > + * When the pre-image is no more than 1/4 of the post-image, > + * there is no effective way to find out which part of pre-image > + * corresponds to the currently interesting range of post-image. > + * And we are in the danger of tracking totally useless lines. > + * So, we just treat all the post-image lines as added from scratch. > + */ > + if (SCALE_FACTOR * (p_end - p_start + 1) < (t_end - t_start + 1)) { So that's what SCALE_FACTOR is good for (and the comment should probably say 1/SCALE_FACTOR instead). Out of curiosity, did you come up with 4 randomly or by experimentation? > +/* > + * When same == 1: > + * [p_start, p_end] represents the diff hunk line range of pre-image, > + * [t_start, t_end] represents the diff hunk line range of post-image. > + * When same == 0, they represents a range of idnetical lines between + * When same == 0, they represent a range of identical lines between > + * two images. > + * > + * This function find out the corresponding line ranges of currently > + * interesting ranges which this diff hunk touches. > + */ > +static void map_range(struct take_range_cb_data *data, int same, > + long p_start, long p_end, long t_start, long t_end) You took some time to comment map_lines, but not this one, sadly. I gather it works as assign_parents_range -> assign_range_to_parent once with map=1, once with map=0 -> either map_range_cb or take_range_cb -> either map_range or take_range but there are few comments on where the decisions should be obvious and where they are just heuristics. Can you add some more comments to enlighten us? > + if (map) > + map_range(&cb, 1, cb.plno + 1, 0x7FFFFFFF, cb.tlno + 1, 0x7FFFFFFF); > + else > + take_range(&cb, cb.plno + 1, 0x7FFFFFFF, cb.tlno + 1, 0x7FFFFFFF); Use INT_MAX from limits.h (and besides, you're not guaranteed to have 32 bits). > + /* > + * Loop on the parents and assign the ranges to different > + * parents, if there is any range left, this commit must > + * be an evil merge. > + */ > + copy = diff_line_range_clone_deeply(r); > + parents = commit->parents; > + while (parents) { > + struct commit *p = parents->item; > + assign_range_to_parent(rev, commit, p, r, &rev->diffopt, 1); IIUC, the latter line is /* assign to the parent what we can */ and the next one > + assign_range_to_parent(rev, commit, p, copy, &rev->diffopt, 0); /* and remove it from our to-be-printed range */ right? If so, please rename the 'copy' variable since its purpose is not to be a copy, but to hold entirely different data. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html