On Tue, Mar 29, 2016 at 11:16 AM, Stefan Beller <sbeller@xxxxxxxxxx> wrote: > On Tue, Mar 29, 2016 at 10:54 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: >> Stefan Beller <sbeller@xxxxxxxxxx> writes: >> >>> I thought this is an optimization for C code where you have a diff like: >>> >>> int existingStuff1(..) { >>> ... >>> } >>> + >>> + int foo(..) { >>> +... >>> +} >>> >>> int existingStuff2(...) { >>> ... >>> >>> Note that the closing '}' could be taken from the method existingStuff1 instead >>> of correctly closing foo. >> >> That is a less optimal output. Another possible output would be >> like so: >> >> int existingStuff1(..) { >> ... >> } >> >> + int foo(..) { >> +... >> +} >> + >> int existingStuff2(...) { >> >> All three are valid output, and ... >> >>> So the correct heuristic really depends on what kind of text we >>> are diffing. >> >> ... this realization is correct. >> >> I have a feeling that any heuristic would be correct half of the >> time, including the ehuristic implemented in the current code. The >> readers of patches have inherent bias. They do not notice when the >> hunk is formed to match their expectation, but they will notice and >> remember when they see something less optimal. >> > > We have 3 possible diffs: > 1) closing brace and newline before the chunk > 2) newline before, closing brace after the chunk > 3) closing brace and newline after the chunk > > For C code we may want to conclude that 3) is best. (appeals the bias of > most people) 2 is slightly worse, whereas 1) is absolutely worst. > > Now looking at the code Jacob found strange: > >> cat > expect <<EOF >> + expected results ... >> + EOF >> +test_expect_failure ... ' >> + ... >> + ' >> + >> +cat > expect <<EOF > > This can be written in two ways: > > 1) "cat > expect <<EOF" before the diff chunk > 2) "cat > expect <<EOF" after the diff chunk > > We claim 1) is better than 2). > This is different from the C code as now we want to have the > same lines before not after. > > To find a heuristic, which appeals both the C code > and the shell code, we could take the empty line > as a strong hint for the divider: > > 1) determine the amount of diff which is ambiguous, i.e. can > go before or after the chunk. > 2) Does the ambiguous part contain an empty line? > 3) If not, I have no offer for you, stop. > 4) divide the ambiguous chunk by the empty line, > 5) put the lines *after* the empty line in front of the chunk > 6) put the part before (including) the empty line after the > chunk > 7) Observe output: > >> } >> >> + int foo(..) { >> +... >> +} >> + >> int existingStuff2(...) { > >> test_expect_failure ... ' >> existing test ... >> ' >> >> + cat > expect <<EOF >> + expected results ... >> + EOF >> +test_expect_failure ... ' >> + ... >> + ' >> + >> cat > expect <<EOF > > This is what we want in both cases. > And I would argue it would appease many other kinds of text as well, because > an empty line is usually a strong indicator for any text that a > different thing comes along. > (Other programming languages, such as Java, C++ and any other C like > language behaves > that way; even when writing latex figures you'd rather want to break > at new lines?) > > Thanks, > Stefan This seems like a good heuristic. Can we think of any examples where it would produce wildly confusing diffs? I don't think it necessarily needs to be default but just a possible option when formatting diffs, much like we already have today. Thanks, Jake -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html