Re: [PATCH v2 00/10] send-email: various optimizations to speed up by >2x

Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, May 28 2021, Felipe Contreras wrote:
> 
> > Ævar Arnfjörð Bjarmason wrote:
> >> Returning a flattened list is idiomatic in Perl, it means that a caller
> >> can do any of:
> >> 
> >>     # I only care about the last value for a key, or only about
> >>     # existence checks
> >>     my %hash = func();
> >
> > I was staying on the sideline because I don't know what's idiomatic in
> > Perl, but Perl and Ruby share a lot in common (one could say Perl is the
> > grandfather of Ruby), and I do know very well what's idiomatic in Ruby.
> >
> > In perl you can do $ENV{'USER'}, and:
> >
> >   while (my ($k, $v) = each %ENV) {
> >     print "$k = $v\n";
> >   }
> >
> > Obviously it's idiomatic to use hashes this way [1].
> 
> For what it's worth idiomatic/good idea and "has an example in the perl
> documentation" unfortunately aren't always aligned. A lot of experienced
> Perl programmers avoid each() like the plague:
> http://blogs.perl.org/users/rurban/2014/04/do-not-use-each.html

Perl is an old language, and each() was introduced in 2010, it's
expected that some old-timers would not adapt to the new idioms.

BTW, Ruby borrowed a lot from Perl, but I'm pretty sure Perl borrowed
each() from Ruby.

Ultimately it doesn't matter what you use to traverse %ENV, my point is
that it's a hash.

> > It was a waste for Git::config_regexp to not do the sensible thing here.
> 
> FWIW we're commenting on a v2 of a series that's at v5 now, and doesn't
> use config_regexp() at all, the relevant code is inlined in
> git-send-email.perl now.

I know, I've been following the threads. I'm trying to say it's a shame
Git::config_regexp does not do the sensible thing.

> > You can do exactly the same in Ruby: ENV['USER']
> >
> >   ENV.each { |k, v| print "#{k} = #{v}\n" }
> >
> > And the way I would parse these configurations in Ruby is something like:
> >
> >   c = `git config -l -z`.split("\0").map { |e| e.split("\n") }.to_h
> >   c['sendemail.smtpserver']
> >
> > And this just gave me an idea...
> 
> I'd probably do it that way in Ruby, but not in Perl.
> 
> Things that superficially look the same in two languages can have
> completely different behaviors, a "hash" isn't a single type of data
> structure in these programming languages.
> 
> In particular Ruby doesn't have hashes in the Perl sense of the word, it
> has an ordered key-value pair structure (IIRC under the hood they're
> hashes + a double linked list).
> 
> Thus you can use it for things like parsing a key=>value text file where
> the key is unique and the order is important.
> 
> In Perl hashes are only meant for key-value lookup, they are not
> ordered, and are actually actively randomly ordered for security
> reasons. In any modern version inserting a new key will have an avalanche
> effect of completely changing the order. It's not even stable across
> invocations:
>     
>     $ perl -wE 'my %h; for ("a".."z") { $h{$_} = $_; say keys %h }'
>     a
>     ab
>     bca
>     dcba
>     daebc
>     cbaedf
>     aecbfdg
>     dgfcbaeh
>     [...]

This used to be the case in Ruby too. The order of hashes was not
guaranteed.

The situation is more complicated because not only do you have different
versions, but you have different implementations. AFAIK the Ruby
language specification doesn't say anything about ordering, although
basically all implementations do order.
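
For comparison with the Perl one-liner above, here is a minimal sketch of
the same experiment in Ruby. It assumes MRI 1.9 or later, where Hash
keeps insertion order; the variable names are only for illustration:

```ruby
# Sketch: build a hash one key at a time and record the key order
# after each insertion (assumes an implementation that keeps
# insertion order, such as MRI 1.9 or later).
h = {}
snapshots = []
("a".."h").each do |k|
  h[k] = k
  snapshots << h.keys.join
end

# Each snapshot extends the previous one; no reshuffling on insert.
puts snapshots
```

On such an implementation the key order only ever grows at the end,
unlike the reshuffling in the Perl output.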

> The other important distinction (but I'm not so sure about Ruby here) is
> that Perl doesn't have any way to pass a hash or any other structure to
> another function, everything is flattened and pushed onto the stack.
> 
> To pass a "hash" you're not passing the hash, but a "flattened" pointer
> to it on the stack.
> 
> Thus passing and making use of these flattened values is idiomatic in
> Perl in a way that doesn't exist in a lot of other languages. In some
> other languages a function has to choose whether it's returning an array
> or a hash, in Perl you can just push the "flattened" items that make up
> the array on the stack, and have the caller decide if they're pushing
> those stack items into an array, or to a hash if they expect it to be
> meaningful as key-value pairs.

Yeah, that's something that wasn't borrowed. In Ruby everything is an
object.

> In the context of Git's config format doing that is the perfect fit for
> config values, our config values *are* ordered, but they are also
> sort-of hashes, but whether it's "all values" or "last value wins" (or
> anything else, that's just the common ones) depends on the key/user.
> 
> So by having a list of key-value pairs on the stack you can choose to
> put it into an array if you don't want to lose information, or put it
> into a hash if all you care about is "last key wins", or "I'm going to
> check for key existence".
> 
> I think that in many other languages that wouldn't make any sense, and
> you'd always return a structure like:
> 
>     [
>          key => [zero or more values],
>         [...]
>     ]
> 
> Or whatever, the caller can also unambiguously interpret those, but
> unlike Perl you'd need to write something to explicitly iterate the
> returned value (or a helper) to get it into a hash or a "flattened"
> array. In Perl it's trivial due to the "everything is on the stack"
> semantics.

Indeed, but my point is that it's a hash for all intents and purposes:

  my %hash = func();

And it makes sense for it to be a hash, just like %ENV.

And although the internals are different something very close would
happen in Ruby:

  hash = func().to_h
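
To make the "last value wins" versus "all values" distinction concrete,
here is a rough sketch. The sample data is a made-up stand-in for what
`git config -l -z` prints (each entry is "key\nvalue" terminated by a
NUL), and it assumes every entry has a value:

```ruby
# Hypothetical sample data in the shape of `git config -l -z` output:
# "key\nvalue" entries, each terminated by "\0".
raw = [
  "sendemail.to\na@example.com",
  "sendemail.to\nb@example.com",
  "sendemail.smtpserver\nlocalhost",
].map { |e| e + "\0" }.join

# Ordered list of [key, value] pairs; nothing is lost at this point.
# The limit of 2 keeps newlines inside a value intact.
pairs = raw.split("\0").map { |e| e.split("\n", 2) }

# "Last value wins": later pairs overwrite earlier ones for the same key.
last = pairs.to_h

# "All values": group the values by key, keeping their original order.
all = pairs.group_by(&:first).transform_values { |kv| kv.map(&:last) }

puts last["sendemail.to"]        # the last configured value
puts all["sendemail.to"].inspect # every configured value, in order
```

The caller picks the structure after the fact, which is roughly what
the flattened list gives you in Perl.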

> Anyway, all that being said the part we're talking about is a trivial
> part of this larger series. I'd much prefer to have it just land as
> "good enough" at this point. It works, we can always tweak it further
> later if there's a need to do that.

Indeed, as I said, the entire patch series looks good to me.

Cheers.

-- 
Felipe Contreras


