Re: [PATCH v2 00/10] send-email: various optimizations to speed up by >2x

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 21, 2021 at 08:23:15AM +0200, Ævar Arnfjörð Bjarmason wrote:

> The "performant" really doesn't matter here, we're comparing shelling
> out to getting a small number of config keys back. So I wasn't trying to
> optimize this.
> 
> Returning a flattened list is idiomatic in Perl, it means that a caller
> can do any of:
> 
>     # I only care about the last value for a key, or only about
>     # existence checks
>     my %hash = func();
> 
> Or:
> 
>     # I want all key-values to iterate over
>     my @kv = func();
> 
> Returning touples like this makes that less convenient for both, who'll
> need to do more work to unpack them.
> 
> For what it's worth In Perl "return a list" means "flattened list", the
> term "flattened list" I think comes from other languages. You'd call
> what you're suggesting a list of arrays, or (if a top-level reference)
> an array of arrays, AoA for short, AoH for array (ref) of hashes etc.

Yeah, I think that is reasonable. But it made me wonder how we handle
value-less booleans, and I think there's indeed a bug.

Try a config like this:

  $ cat >foo <<\EOF
  [foo]
  key-with-value = string
  key-with-empty =
  just-bool
  another-key-with-value = another
  EOF

A regular config --list looks like:

  $ git config --file=foo --list
  foo.key-with-value=string
  foo.key-with-empty=
  foo.just-bool
  foo.another-key-with-value=another

Note how "just-bool" drops the "=" to distinguish it from the empty
string. With "-z", it looks like this:

  $ git config --file=foo --list -z
  foo.key-with-value
  string^@foo.key-with-empty
  ^@foo.just-bool^@foo.another-key-with-value
  another^@

The NULs separate keys, but keys are separated from their values by a
newline. And again, just-bool omits the newline.

Your parser splits on newline, so for that entry we'll get only one
string returned in the list (the key), rather than two (the key and
value). In a flattened list, that becomes ambiguous. E.g., adapting your
parser into a stand-alone script:

  $ git config --file=foo --list -z |
    perl -e '
      local $/;
      my $data = <STDIN>;
      my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data;

      while (@kv) {
        my $k = shift @kv;
	print "key: $k\n";
	my $v = shift @kv;
	print "  value: ", (defined $v ? $v : "undef"), "\n";
      }
    '
  key: foo.key-with-value
    value: string
  key: foo.key-with-empty
    value: 
  key: foo.just-bool
    value: foo.another-key-with-value
  key: another
    value: undef

We end up misinterpreting a key as a value, and vice versa.

Using a non-flattened structure would have prevented this (we'd sensibly
get undef when trying to access the missing second element of the
array). But I do agree the flattened structure is more perl-ish.
Probably you'd want to insert an explicit "undef" into the list. The
most perl-ish I could come up with is:

  my (@kv) = map { my ($k, $v) = split /\n/, $_, 2;
                   ($k, $v)
                 } split /\0/, $data;

I notice that $known_keys then becomes a non-flat representation. You'd
either want to turn that back into a zero-length array there, or store
the "undef" and handle it appropriately (it can be a synonym for "true",
though that is just an optimization at this point).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux