On Tue, Feb 2, 2021 at 5:02 AM Christian Couder <christian.couder@xxxxxxxxx> wrote:
> On Tue, Feb 2, 2021 at 3:01 AM Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
> > On Fri, Jan 29, 2021 at 1:25 PM Charvi Mendiratta <charvi077@xxxxxxxxx> wrote:
> > > +       merge_*|fixup_*)
> > > +               action=$(echo "$line" | sed 's/_/ /g');;
> >
> > What is "merge_" doing here? It doesn't seem to be used by this patch.
>
> Yeah, it's not used, but it might be a good thing to add this for
> consistency while at it.

It confuses readers (as it did to me), causing them to waste
brain-cycles trying to figure out why it's present. Thus, it would be
better to add it when it's actually needed. The waste of brain-cycles
and time is especially important on a project like Git for which
reviewers and reviewer time are limited resources.

> > > +# Copyright (c) 2018 Phillip Wood
> >
> > Did Phillip write this script? Is this patch based upon an old patch from him?
>
> Yeah, it might be a good idea to add a "Based-on-patch-by: Phillip ..."

Agreed.

> > The implementation of test_commit_message() is a bit hard to follow.
> > It might be simpler to write it more concisely and directly like this:
> >
> >     git show --no-patch --pretty=format:%B "$1" >actual &&
> >     case "$2" in
> >     -m) echo "$3" >expect && test_cmp expect actual ;;
>
> I think we try to avoid many commands on the same line.

For something this minor, it's not likely to matter but, of course, it
could be split over two lines:

    -m) echo "$3" >expect &&
        test_cmp expect actual ;;

> >     *) test_cmp "$2" actual ;;
> >     esac
>
> In general I am not sure that using $1, $2, $3 directly makes things
> easier to understand, but yeah, with the function documentation that
> you suggest, it might be better to write the function using them
> directly.

The direct $1, $2, etc. was just an example. It's certainly possible
to give them names even in the rewritten code I presented. One good
reason, however, for just using $1, $2, etc.
is that $2 is not well defined; sometimes it's a switch ("-m") and
sometimes it's a pathname, so it's hard to invent a suitable variable
name for it. Also, this function becomes so simple (in the rewritten
version) that explicit variable names don't add a lot of value (the
cognitive load is quite low because the function is so short).

> > Style nit: In Git test scripts, the here-doc body and EOF are indented
> > the same amount as the command which opened the here-doc:
>
> I don't think we are very consistent with this and I didn't find
> anything about this in CodingGuidelines.
>
> In t0008 and t0021 for example, the indentation is more like:
>
> cat >message <<-EOF &&
>     amend! B
>     ...
>     body
>     EOF
>
> and I like this style, as it seems clearer than the other styles.

I performed a quick survey of the heredoc styles in the tests. Here
are the results[1] of my analysis on the 'seen' branch:

total-heredocs=4128

same-indent=3053 (<<EOF & body & EOF share indent)
    cat >expect <<-\EOF
    body
    EOF

body-eof-indented=24 (body & EOF indented)
    cat >expect <<-\EOF
        body
        EOF

body-indented=735 (body indented; EOF not)
    cat >expect <<-\EOF
        body
    EOF

left-margin=316 (<<EOF indented; body & EOF not)
    cat >expect <<\EOF
body
EOF

So, the indentation recommended in my review -- with 3053 instances
out of 4128 heredocs -- is by far the most prevalent in the project.

[1]: Note that there is a minuscule amount of inaccuracy in the
numbers because there are a few cases in which heredocs contain other
heredocs, and some scripts build heredocs piecemeal when constructing
other scripts, and I didn't bother making my analysis script handle
those few cases. The inaccuracy is tiny, thus not meaningful to the
overall picture.

> > I see that you mirrored the implementation of FAKE_LINES handling of
> > "exec" here for "fixup", but the cases are quite different.
> > The argument to "exec" is arbitrary and can have any number of spaces
> > embedded in it, which conflicts with the meaning of spaces in
> > FAKE_LINES, which separate the individual commands in FAKE_LINES.
> > Consequently, "_" was chosen as a placeholder in "exec" to mean
> > "space".
> >
> > However, "fixup" is a very different beast. Its arguments are not
> > arbitrary at all, so there isn't a good reason to mirror the choice of
> > "_" to represent a space, which leads to rather unsightly tokens such
> > as "fixup_-C". It would work just as well to use simpler tokens such
> > as "fixup-C" and "fixup-c", in which case t/lib-rebase.sh might parse
> > them like this (note that I also dropped `g` from the `sed` action):
> >
> >     fixup-*)
> >         action=$(echo "$line" | sed 's/-/ -/');;
>
> I agree that "fixup" arguments are not arbitrary at all, but I think
> it makes things simpler to just use one way to encode spaces instead
> of many different ways.

Is that the intention here, though? Is the idea that some day `fixup`
will accept arbitrary arguments and thus need to encode spaces? If
not, then mirroring the treatment given to `exec` confuses readers
into thinking that it will/should accept arbitrary arguments. I
brought this up in my review specifically because it was confusing to
a person (me) new to this topic and reading the patches for the first
time. The more specific and exact the code can be, the less likely it
will confuse readers in the future.

Anyhow, it's a minor point, not worth expending a lot of time
discussing.

> > It feels clunky and fragile for this test to be changing
> > "expected-message" which was created in the "setup" test and used
> > unaltered up to this point.
> > If the content of "expected-message" is really going to change from
> > test to test (as I see it changes again in a later test), then it
> > would be easier to reason about the behavior if each test gives
> > "expected-message" the precise content it should have in that local
> > context. As it is currently implemented, it's too difficult to
> > follow along and remember the value of "expected-message" from test
> > to test. It also makes it difficult to extend tests or add new tests
> > in between existing tests without negatively impacting other tests.
> > If each test sets up "expected-message" to the precise content
> > needed by the test, then both those problems go away.
>
> Yeah, perhaps the global "expected-message" could be renamed for
> example "global-expected-message", and tests which need a specific one
> could prepare and use a custom "expected-message" (maybe named
> "custom-expected-message") without ever changing
> "global-expected-message".

That would be fine, though I wondered while reviewing the patch if a
global "expected-message" file was even needed since it didn't seem
like very many tests used it (but I didn't spend a lot of time
counting the exact number of tests due to the high cognitive load of
tracing how that file might mutate as it passed through each test).

Another really good reason to avoid having later tests depend upon
mutations from earlier tests, if possible, is that it makes it easier
to run tests selectively with --run or GIT_SKIP_TESTS.
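
Returning to the "fixup-C" token suggestion earlier in this message,
here is a minimal standalone sketch of how the two encodings differ.
The function name parse_fake_line is hypothetical and exists only for
illustration; the quoted patch suggests t/lib-rebase.sh does this kind
of mapping inside a case statement on "$line", and the "exec_*" arm
below is an assumption based on the "_"-means-space convention
described above, not a copy of the real code.

```shell
# Hypothetical helper, not part of t/lib-rebase.sh.
# Prints the todo-list action that a FAKE_LINES token stands for.
parse_fake_line () {
	line=$1
	case "$line" in
	exec_*)
		# "exec" takes arbitrary arguments, so every "_" is a space
		echo "$line" | sed 's/_/ /g' ;;
	fixup-*)
		# "fixup" takes one fixed option, so only the first "-"
		# needs a space in front of it (hence no `g` flag)
		echo "$line" | sed 's/-/ -/' ;;
	*)
		# anything else is passed through unchanged
		echo "$line" ;;
	esac
}
```

With this sketch, `parse_fake_line fixup-C` prints "fixup -C" and
`parse_fake_line exec_echo_hi` prints "exec echo hi", which shows that
"fixup" does not need the general-purpose "_" encoding.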