Hi Eric, On Tue, Mar 16, 2021 at 02:37:16AM -0400, Eric Sunshine wrote: > On Mon, Mar 15, 2021 at 8:57 PM Denton Liu <liu.denton@xxxxxxxxx> wrote: > > In the previous few commits, we sorted many lists into ASCII-order. In > > order to ensure that they remain that way, add the 'check-sort' target. > > [...] > > Signed-off-by: Denton Liu <liu.denton@xxxxxxxxx> > > --- > > +my @regexes = map { qr/^$_/ } @ARGV; > > +my $last_regex = 0; > > +my $last_line = ''; > > +while (<STDIN>) { > > + my $matched = 0; > > + chomp; > > + for my $regex (@regexes) { > > + next unless $_ =~ $regex; > > + if ($last_regex == $regex) { > > + die "duplicate lines: '$_'\n" unless $last_line ne $_; > > + die "unsorted lines: '$last_line' before '$_'\n" unless $last_line lt $_; > > + } > > + $matched = 1; > > + $last_regex = $regex; > > + $last_line = $_; > > + } > > + unless ($matched) { > > + $last_regex = 0; > > + $last_line = ''; > > + } > > +} > > This is, of course, endlessly bikesheddable. Here is a shorter -- and, > at least for me, easier to understand -- way to do it: > > my $rc = 0; > chomp(my @all = <STDIN>); > foreach my $needle (@ARGV) { > my @lines = grep(/^$needle/, @all); > if (join("\n", @lines) ne join("\n", sort @lines)) { > print "'$needle' lines not sorted\n"; > $rc = 1; > } > } > exit $rc; That's pretty clever, thanks for showing me how it's done :) However, the reason I wrote it out the way that I did is because my code ensures that consecutive lines matching the regex are sorted but if there are any breaks between matching regex lines, it will consider them separate blocks. Just taking all the lines fails in the case of `LIB_OBJS \+=` in Makefile since we have LIB_OBJS += zlib.o [... many intervening lines ...] LIB_OBJS += $(COMPAT_OBJS) and that is technically unsorted. That being said, I don't really like my current approach that much. I think I have two better options: 1. Tighten up the regexes so that it excludes the $(COMPAT_OBJS). I don't want to be too strict, though, because if we end up not matching a line it might end up unsorted. 2. Consider blank lines to be block separators and only consider it to be sorted if the text matching regexes within a block are sorted. Now that I've written that all out, I think I like option 1 more, although I could definitely be convinced to go either way. > By the way, it might be a good idea to also print the filename in > which the problem occurred. Such context can be important for the > person trying to track down the complaint. To do so, you'd probably > want to pass the filename as an argument, and open and read the file > rather than sending it only as standard-input. Agreed. Thanks, Denton