Re: [PATCH 5/7] Makefile: add 'check-sort' target

Denton Liu <liu.denton@xxxxxxxxx> · Wed, 17 Mar 2021 02:50:09 -0700

Hi Eric,

On Tue, Mar 16, 2021 at 02:37:16AM -0400, Eric Sunshine wrote:
> On Mon, Mar 15, 2021 at 8:57 PM Denton Liu <liu.denton@xxxxxxxxx> wrote:
> > In the previous few commits, we sorted many lists into ASCII-order. In
> > order to ensure that they remain that way, add the 'check-sort' target.
> > [...]
> > Signed-off-by: Denton Liu <liu.denton@xxxxxxxxx>
> > ---
> > +my @regexes = map { qr/^$_/ } @ARGV;
> > +my $last_regex = 0;
> > +my $last_line = '';
> > +while (<STDIN>) {
> > +       my $matched = 0;
> > +       chomp;
> > +       for my $regex (@regexes) {
> > +               next unless $_ =~ $regex;
> > +               if ($last_regex == $regex) {
> > +                       die "duplicate lines: '$_'\n" unless $last_line ne $_;
> > +                       die "unsorted lines: '$last_line' before '$_'\n" unless $last_line lt $_;
> > +               }
> > +               $matched = 1;
> > +               $last_regex = $regex;
> > +               $last_line = $_;
> > +       }
> > +       unless ($matched) {
> > +               $last_regex = 0;
> > +               $last_line = '';
> > +       }
> > +}
> 
> This is, of course, endlessly bikesheddable. Here is a shorter -- and,
> at least for me, easier to understand -- way to do it:
> 
>     my $rc = 0;
>     chomp(my @all = <STDIN>);
>     foreach my $needle (@ARGV) {
>         my @lines = grep(/^$needle/, @all);
>         if (join("\n", @lines) ne join("\n", sort @lines)) {
>             print "'$needle' lines not sorted\n";
>             $rc = 1;
>         }
>     }
>     exit $rc;

That's pretty clever, thanks for showing me how it's done :)

However, the reason I wrote it out the way that I did is because my code
ensures that consecutive lines matching the regex are sorted but if
there are any breaks between matching regex lines, it will consider them
separate blocks. Just taking all the lines fails in the case of
`LIB_OBJS \+=` in Makefile since we have

	LIB_OBJS += zlib.o

	[... many intervening lines ...]

	LIB_OBJS += $(COMPAT_OBJS)

and that is technically unsorted. That being said, I don't really like
my current approach that much.

I think I have two better options:

	1. Tighten up the regexes so that it excludes the
	   $(COMPAT_OBJS). I don't want to be too strict, though,
	   because if we end up not matching a line it might end up
	   unsorted.

	2. Consider blank lines to be block separators and only consider
	   it to be sorted if the text matching regexes within a block
	   are sorted.

Now that I've written that all out, I think I like option 1 more,
although I could definitely be convinced to go either way.

> By the way, it might be a good idea to also print the filename in
> which the problem occurred. Such context can be important for the
> person trying to track down the complaint. To do so, you'd probably
> want to pass the filename as an argument, and open and read the file
> rather than sending it only as standard-input.

Agreed.

Thanks,
Denton