On Fri, May 21 2021, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > >> I like all of this, except for the change in the interface of >> Git::config_regexp(). You mention that it's new-ish, and probably not in >> wide use. And I agree that's probably true. But it feels like violating >> a principle of not breaking APIs, and we should stick to that principle >> and not bend it for "well, it's not that old an API". >> >> I'd find it more compelling if it the existing interface was broken or >> hard to avoid changing. But couldn't we just add a new function with the >> extra info (config_regexp_with_values() or something)? > > We seem to use it in-tree only once, but we have no control over > what out-of-tree users have been doing. I thought people might have bought my argument in v1 that we just document these as "public" as part of some copy/pasting in Git.pm, but sure, I'll re-roll and make this a new function or something... > I do like your proposed name for the fact that it has _with_values > in it; the original gave us only a list of configuration variable > names, and now we are getting name-value tuples. > > BUT > > I am not sure if I like the new interface, and I do not know if the > the name "config_with_values" is a good match for what it does. > > Namely, when a function is described as: > > =item config_regexp ( RE ) > > Retrieve the list of configuration key names matching the regular > expression C<RE>. The return value is an ARRAY of key-value pairs. > > I would expect it to return ([key1, value1], [key2, value2], ...), > but the implementation returns (key1, value1, key2, value2, ...), > i.e. a flattened list, if I am not mistaken. ... > my @cmd = ('config', '--null', '--get-regexp', $regex); > unshift @cmd, $self if $self; > my $data = command(@cmd); > my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data; > return @kv; > > We get NUL terminated records, so we first split the entire output > at NULs (to get a list of key-value records). Each key-value record > has the key followed by LF followed by value, so we split it at the > first LF (i.e. a value with an embedded LF would behave correctly) > to extract key and value out of it. But the resulting list is a > flat list with alternating key and value. > > The side that works on the returned value of courese knows that it > is getting a flattened list and acts accordingly: > > my @kv = Git::config_regexp("^sende?mail[.]"); > while (my ($k, $v) = splice @kv, 0, 2) { > push @{$known_config_keys{$k}} => $v; > } > > Perhaps it may be more performant than a more obvious and > straight-forward "list of tuples", i.e. > > return map { [split /\n/, $_, 2] } split /\0/, $data; > > my @kv = Git::config_regexp("^sende?mail[.]"); > for my $tuple (@kv) { > push @{$known_config_keys{$tuple->[0]}}, $tuple->[1]; > } > > but somehow the flattened list felt unnatural at least to a naïve > reader like me. The "performant" really doesn't matter here, we're comparing shelling out to getting a small number of config keys back. So I wasn't trying to optimize this. Returning a flattened list is idiomatic in Perl, it means that a caller can do any of: # I only care about the last value for a key, or only about # existence checks my %hash = func(); Or: # I want all key-values to iterate over my @kv = func(); Returning touples like this makes that less convenient for both, who'll need to do more work to unpack them. For what it's worth In Perl "return a list" means "flattened list", the term "flattened list" I think comes from other languages. You'd call what you're suggesting a list of arrays, or (if a top-level reference) an array of arrays, AoA for short, AoH for array (ref) of hashes etc.