On Thu, Aug 29, 2024 at 6:03 AM Patrick Steinhardt <ps@xxxxxx> wrote: > On Thu, Aug 29, 2024 at 05:16:24AM -0400, Eric Sunshine wrote: > > - push(@{$self->{parser}->{problems}}, ['UNCLOSED-HEREDOC', $tag]); > > + push(@{$self->{parser}->{problems}}, ['HEREDOC', $tag]); > > $$b =~ /(?:\G|\n).*\z/gc; # consume rest of input > > my $body = substr($$b, $start, pos($$b) - $start); > > $self->{lineno} += () = $body =~ /\n/sg; > > I was wondering why this is being changed here, as I found the old name > to be easier to understand. Then I saw further down that you essentially > use those as identifiers for the actual problem. Peff chose[1] the longer "UNCLOSED-HEREDOC" over the (perhaps too) terse "HERE" I had chosen[2], however, now that this is an internal detail of the script -- not part of the user-facing output -- such verbosity is unneeded. As programmers, just as we choose shorter variable names (say, "i" instead of "record_index" in a for-loop), I find "HEREDOC" easier to read in a code context than the longer "UNCLOSED-HEREDOC", hence this (admittedly unnecessary) change. [1]: https://lore.kernel.org/git/20230330193031.GC27989@xxxxxxxxxxxxxxxxxxxxxxx/ [2]: https://lore.kernel.org/git/CAPig+cQiOGrDSUc34jHEBp87Rx-dnXNcPcF76bu0SJoOzD+1hw@xxxxxxxxxxxxxx/ > Is there a specific reason why we now have the separate translation > step? Couldn't we instead push the translated message here, directly? I considered that but, although this instance is a simple "push" operation, some heuristics scan and modify the `problems` array by looking for and removing specific items. There are numerous instances in (older) scripts similar to this: if condition not satisified then echo it did not work... echo failed! return 1 fi which prints an error message and then explicitly signals failure with `return 1` (or `exit 1` or `false`) as the final command in an `if` branch or `case` arm. In these cases, the tests don't bother maintaining the &&-chain between `echo` and the explicit "test failed" indicator. As chainlint processes the token stream, it correctly pushes "AMP" annotations onto the `problems` array for each of the `echo` lines, but when it encounters the explicit `return 1`, the heuristic kicks in and notices that the broken &&-chain leading up to `return 1` is immaterial since the construct is manually signaling failure, thus the &&-chain breakage is legitimate and safe. Requiring test authors to add "&&" to each such line would just be making busy-work for them. Hence, the heuristic actively removes the preceding "AMP" annotations from `problems`. For the removal, it's easier to search `problems` for a simple token such as "AMP" than to search for a user-facing message such as "ERR missing '?!'". > > -8 bar=$((42 + 1)) ?!AMP?! > > +8 bar=$((42 + 1)) ?!ERR missing '&&'?! > > I find the resulting error messages a bit confusing: to me it reads as > if "ERR" is missing the ampersands. Is it actually useful to have the > ERR prefix in the first place? We do not output anything but errors, so > it feels somewhat redundant. As you mentioned in your review of [2/2], the "ERR" prefix serves as a useful target for searches in a terminal. Regarding possible confusion, my first draft placed a colon after the prefix, i.e.: ERR: missing '&&' but it seemed unnecessarily noisy, so I dropped the colon since: ERR missing '&&' seemed clear enough. However, I don't feel too strongly about it and can add the colon back if people think it would make the message clearer.