On Tue, Mar 29, 2016 at 10:54 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Stefan Beller <sbeller@xxxxxxxxxx> writes: > >> I thought this is an optimization for C code where you have a diff like: >> >> int existingStuff1(..) { >> ... >> } >> + >> + int foo(..) { >> +... >> +} >> >> int existingStuff2(...) { >> ... >> >> Note that the closing '}' could be taken from the method existingStuff1 instead >> of correctly closing foo. > > That is a less optimal output. Another possible output would be > like so: > > int existingStuff1(..) { > ... > } > > + int foo(..) { > +... > +} > + > int existingStuff2(...) { > > All three are valid output, and ... > >> So the correct heuristic really depends on what kind of text we >> are diffing. > > ... this realization is correct. > > I have a feeling that any heuristic would be correct half of the > time, including the ehuristic implemented in the current code. The > readers of patches have inherent bias. They do not notice when the > hunk is formed to match their expectation, but they will notice and > remember when they see something less optimal. > We have 3 possible diffs: 1) closing brace and newline before the chunk 2) newline before, closing brace after the chunk 3) closing brace and newline after the chunk For C code we may want to conclude that 3) is best. (appeals the bias of most people) 2 is slightly worse, whereas 1) is absolutely worst. Now looking at the code Jacob found strange: > cat > expect <<EOF > + expected results ... > + EOF > +test_expect_failure ... ' > + ... > + ' > + > +cat > expect <<EOF This can be written in two ways: 1) "cat > expect <<EOF" before the diff chunk 2) "cat > expect <<EOF" after the diff chunk We claim 1) is better than 2). This is different from the C code as now we want to have the same lines before not after. To find a heuristic, which appeals both the C code and the shell code, we could take the empty line as a strong hint for the divider: 1) determine the amount of diff which is ambiguous, i.e. can go before or after the chunk. 2) Does the ambiguous part contain an empty line? 3) If not, I have no offer for you, stop. 4) divide the ambiguous chunk by the empty line, 5) put the lines *after* the empty line in front of the chunk 6) put the part before (including) the empty line after the chunk 7) Observe output: > } > > + int foo(..) { > +... > +} > + > int existingStuff2(...) { > test_expect_failure ... ' > existing test ... > ' > > + cat > expect <<EOF > + expected results ... > + EOF > +test_expect_failure ... ' > + ... > + ' > + > cat > expect <<EOF This is what we want in both cases. And I would argue it would appease many other kinds of text as well, because an empty line is usually a strong indicator for any text that a different thing comes along. (Other programming languages, such as Java, C++ and any other C like language behaves that way; even when writing latex figures you'd rather want to break at new lines?) Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html