Re: [PATCH/RFC v2 5/6] gettext: Add a Gettext interface for Perl

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 1, 2010 at 17:00, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
>
> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> Make Git's gettext messages available to Perl programs through
>> Locale::Messages. Gracefully fall back to English on systems that
>> don't contain the module.
>
> This is I think a very good idea, both providing wrapper with fallback
> to untranslated messages (and encapsulating translation), and using the
> Locale::Messages module from libintl-perl library.  The "On the state of
> i18n in Perl" (http://rassie.org/archives/247) blog post from 26 April
> 2009 provides nice counterpoint to Locale::Maketext::TPJ13 / The Perl
> Journal #13 article[1] from 1999... especially because using gettext is
> natural for translating git command output and GUI, because git uses it
> already for Tcl/Tk (gitk and git-gui), and it is natural solution for
> code in C, which slightly less than half of git code.
>
> Well, we could use Locale::Maketext::Gettext, but it is not in Perl
> core either, and as http://rassie.org/archives/247 says its '*.po'
> files are less natural.  The gettext documentation (gettext.info) also
> recommends libintl-perl, or to be more exact Locale::TextDomain from
> it.
>
> [1] http://search.cpan.org/perldoc?Locale::Maketext::TPJ13
>    http://interglacial.com/~sburke/tpj/as_html/tpj13.html

Locale::Maketext is a less complete and non-standard alternative as
the above blogs note. I think the main reason it's used in favor of
Gettext in many Perl projects is that it can be compiled stand-alone
from the CPAN, i.e. it doesn't depend on libintl.

That's comfortable when CPAN is your main distribution channel, but
Git's main distribution channel is via OS packages, which can
trivially introduce a gettext dependency (for which they doubtless
already have a package).

Locale::Maketext has a Gettext emulation layer, but using it would be
a potential source of bugs (no emulation is perfect).

That being said the way I wrote Git::Gexttext means any alterante
implementation or emulation layer can be seemlessly added later on. We
could use (or write) something for Perl, C or Shell that completely
bypasses libintl for systems that don't have a gettext C library.

> The question is why not use Locale::TextDomain, the high-level Perl-y
> framework, wrapper around Locale::Messages from the same libintl-perl
> library?  The gettext documentation (in gettext.info, chapter "13.5.18
> Perl") says:
>
>  Prerequisite
>     `use POSIX;'
>     `use Locale::TextDomain;' (included in the package libintl-perl
>     which is available on the Comprehensive Perl Archive Network CPAN,
>     http://www.cpan.org/).
>
>
> This would change
>
>  -      print "Emails will be sent from: ", $sender, "\n";
>  +      printf gettext("Emails will be sent from: %s\n"), $sender;
>
> to either
>
>  +      print __"Emails will be sent from: ", $sender, "\n";
>
> or
>
>  +      printf __("Emails will be sent from: %s\n"), $sender;
>
> or
>
>  +      print __x("Emails will be sent from: {sender}\n",
>  +                sender => $sender);

I didn't use Locale::TextDomain because it exports a fatter interface
by default, and I'm not sure at this point what subset of it it would
make sense to support.

This is the default Locale::TexdDomain interface:

   @EXPORT = qw (__ __x __n __nx __xn __p __px __np __npx $__ %__
                 N__ N__n N__p N__np);

My Git::Gettext only exports a single gettext() function now. I think
it's better at this point to have a really small interface and decide
later if we'd like to expand it, preferably in a way that'll work
consistently across C, Perl and Shell programs.

We might want to improve plural support, add msgctxt etc. later. But
for an initial implementation I'd rather have something simple &
stupid.

>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
>> ---
>>  Makefile            |    4 ++-
>>  git-send-email.perl |    3 +-
>>  perl/Git/Gettext.pm |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  perl/Makefile.PL    |    5 ++-
>>  4 files changed, 92 insertions(+), 3 deletions(-)
>>  create mode 100644 perl/Git/Gettext.pm
>>
>> diff --git a/Makefile b/Makefile
>> index dce2faa..2101713 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1884,7 +1884,9 @@ cscope:
>>       $(FIND) . -name '*.[hcS]' -print | xargs cscope -b
>>
>>  pot:
>> -     $(XGETTEXT) -k_ -o po/git.pot $(C_OBJ:o=c) $(SCRIPT_SH)
>> +     $(XGETTEXT) --keyword=_ --output=po/git.pot --language=C $(C_OBJ:o=c)
>> +     $(XGETTEXT) --join-existing --output=po/git.pot --language=Shell $(SCRIPT_SH)
>
> Shouldn't this line be in earlier patch, i.e. in "gettext: Add a
> Gettext interface for shell scripts"?

It needed to be split up into seperate line in the Perl commit because ...

>> +     $(XGETTEXT) --join-existing --output=po/git.pot --language=Perl $(SCRIPT_PERL)
>
> From gettext documentation (in gettext.info, chapter "13.5.18 Perl"):
>
>  Extractor
>     `xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2
>     -kN__ -k'
>
> Is it equivalent to specifying 'xgettext --language=Perl'?

I'm doing --language=Perl because gettext doesn't recognize .perl as a
valid Perl extension. It only knows about `pl', `PL', `pm', `cgi'. (As
an aside, I haven't found why Git chose to use the quaint .perl
extension).

But yeah, using --language=Perl is overshooting it with regards to
keyword extraction. From "5.1 Invoking the `xgettext' Program":

     To disable the default keyword specifications, the option `-k' or
     `--keyword' or `--keyword=', without a KEYWORDSPEC, can be used.

I'll check if I can supply --keyword to all the XGETTEXT invocations.

> Of course the above assumes that you are using Locale::TextDomain, or
> at least use the same conventions.
>
>>
>>  POFILES := $(wildcard po/*.po)
>>  MOFILES := $(patsubst po/%.po,share/locale/%/LC_MESSAGES/git.mo,$(POFILES))
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> index 111c981..a36718e 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -26,6 +26,7 @@ use Term::ANSIColor;
>>  use File::Temp qw/ tempdir tempfile /;
>>  use Error qw(:try);
>>  use Git;
>> +use Git::Gettext qw< :all >;
>
> Please follow the existing convention, i.e.
>
>  +use Git::Gettext qw(:all);
>
> like you see in context line for 'use Error'.
>
>
> Well, not that git-send-email.perl is very consistent about this
> issue, see 'use File::Temp qw/ tempdir tempfile /;' versus
> 'use Error qw(:try);' but I'd reather you didn't introduce yet
> another form.

OK. I'll change that. I usually just pick whatever Perl quote
delimiters that happen to, literally, be close at hand at the time :)

>>  Getopt::Long::Configure qw/ pass_through /;
>>
>> @@ -674,7 +675,7 @@ if (!defined $sender) {
>>       $sender = $repoauthor || $repocommitter || '';
>>       $sender = ask("Who should the emails appear to be from? [$sender] ",
>>                     default => $sender);
>> -     print "Emails will be sent from: ", $sender, "\n";
>> +     printf gettext("Emails will be sent from: %s\n"), $sender;
>
> As I wrote, this IMHO should be either
>
>  +     printf __("Emails will be sent from: %s\n"), $sender;
>
> or
>
>  +     print __x("Emails will be sent from: {sender}\n", sender => $sender);
>
> (note that parantheses differ in those two examples).

I'd prefer the former argument expansion form (i.e. %s, not
{$sender}), just so translators don't have to deal with two different
argument forms.

As for __ or gettext I don't mind. I just picked the former on a whim
to match the Shell version. Maybe I should just change C|Perl|Shell to
use _?

And we can use _, __ is just used to disambiguate it from the _
filehandle, and Perl's pretty good at disambiguating that already:

    $ perl -E 'sub _ ($) { scalar reverse $_[0] }; say _("moo"); stat
"/etc/hosts"; say ((stat(_))[7])'
    oom
    168

>>       $prompting++;
>>  }
>>
>> diff --git a/perl/Git/Gettext.pm b/perl/Git/Gettext.pm
>> new file mode 100644
>> index 0000000..f434783
>> --- /dev/null
>> +++ b/perl/Git/Gettext.pm
>> @@ -0,0 +1,83 @@
>> +package Git::Gettext;
>
> Should this package be named Git::Gettext, or other name would be
> better, perhaps Git::I18N (like e.g. Games::Risk have
> Games::Risk::I18N), or Git::Locale, or even Git::Translator?
>
> Not very important.

I really have no opinion on that, but it does seem like a lot of Perl
packages on CPAN use the ::I18N suffix.

>> +use strict;
>> +use warnings;
>> +use Exporter;
>> +use base 'Exporter';
>
> O.K.
>
> The alternative would be to use
>
>  +use Exporter qw(import);

Not if we want to maintain 5.6 support. The `use Exporter "import"'
form was only introduced in 5.8. I use it everywhere where I don't
have to care about 5.6 support (which is everywhere but Git).

>> +
>> +our $VERSION = '0.01';
>> +
>> +our @EXPORT;
>> +our @EXPORT_OK = qw< gettext >;
>
> Same comment as above: more common way is to use qw(gettext) not
> qw< gettext >; it is also the notation used in Exporter documentation.

Will do.

>> +our %EXPORT_TAGS;
>> +@{ $EXPORT_TAGS{'all'} } = @EXPORT_OK;
>
> Why not simply
>
>  +our %EXPORT_TAGS = ('all' => [ @EXPORT_OK ]);
>
> or the reverse
>
>  +our %EXPORT_TAGS = ('all' => [qw(gettext)]);
>  +our @EXPORT_OK   = @{$EXPORT_TAGS{'all'}};
>
> or the reverse using tag utility functions
>
>  +our %EXPORT_TAGS = ('all' => [qw(gettext)]);
>  +Exporter::export_ok_tags('all');

Just because it some old code of mine I had around that I copied
from. I could rewrite it to a nicer form.

Actually I was considering changing it to just have @EXPORT and
nothing else. You're going to use gettext if you use the medule
anyway, so we might as well just import them all everywhere we use
Gettext in Perl.

>> +
>> +sub __bootstrap_locale_messages {
>> +     our $TEXTDOMAIN = 'git';
>> +
>> +     # TODO: How do I make the sed replacements in the top level
>> +     # Makefile reach me here?
>> +     #our $TEXTDOMAINDIR = q|@@LOCALEDIR@@|;
>
> In Perl (well, in gitweb/gitweb.perl) we use '++VAR++' and not
> '@@VAR@@' for placeholders, because '@' is sigil in Perl.  This is not
> important in above example, because it is not interpolated string.

I'll change it to follow that convention.

> Make invoked on perl/Makefile, when invoked from main Makefile by
> '$(MAKE) -C perl' (via QUIET_SUBDIR0) passes 'localedir' to submake;
> perl/Makefile should probably have something like
>
>  localedir ?= $(sharedir)/locale
>
> That is assuming that 'localedir' is added to list of exported
> variables.
>
> But I am not sure how such substitution should be performed.

I'll try to get something like that to work. Actually the main problem
seemed to be that it had a hybrid handwritten Makefile and one made by
EU::MM.

>> +     our $TEXTDOMAINDIR = q</usr/local/share/locale>;
>
> Why q<...> and not simply '...'?

Same reason I use qw<>, that is no particular reason other than it
being easier to type on my keyboard. I'll just change it to ''.

>> +     require POSIX;
>> +     POSIX->import(qw< setlocale >);
>> +     # Non-core prerequisite module
>> +     require Locale::Messages;
>> +     Locale::Messages->import(qw< :locale_h :libintl_h >);
>> +
>> +     setlocale(LC_MESSAGES(), '');
>> +     setlocale(LC_CTYPE(), '');
>> +     textdomain($TEXTDOMAIN);
>> +     bindtextdomain($TEXTDOMAIN => $TEXTDOMAINDIR);
>> +
>> +     return;
>> +}
>
> This would probably be a bit simpler with Locale::TextDomain.
>
>> +
>> +BEGIN
>> +{
>> +     local ($@, $!);
>> +     eval { __bootstrap_locale_messages() };
>> +     if ($@) {
>> +             # Oh noes, no Locale::Messages here
>> +             *gettext = sub ($) { $_[0] };
>
> Do you intended to use subroutine protytype here?  Ah, I see that
> Locale::Messages::gettext has the same prototype...

Yeah, otherwise the fallback would have different semantics.

>> +     }
>> +}
>
> Does it need to be in BEGIN block?  Probably yes.

I'm pretty sure it needs to all be at BEGIN time. I recall that
declaring prototypes at runtime was an error in some older versions,
but it works now:

    $ perl -E '*meh = sub ($) { "foo" }; say prototype "meh"'
    $

Anyway since it's code that's being use'd it's going to always be at
BEGIN time anyway. I just wanted to be explicit.

>> +
>> +1;
>> +
>> +__END__
>> +
>
> It's nice that you have provided documentation.
>
>> +=head1 NAME
>> +
>> +Git::Gettext - Perl interface to Git's Gettext localizations
>> +
>> +=head1 DESCRIPTION
>> +
>> +Git's internal interface to Gettext via L<Locale::Messages>. If
>> +L<Locale::Messages> can't be loaded (it's not a core module) we
>> +provide stub passthrough fallbacks.
>
> Very good.
>
> It would probably be better though to use L<Locale::TextDomain>
> instead of low(er)-level L<Locale::Messages>.
>
>> +
>> +=head1 FUNCTIONS
>> +
>> +=head2 gettext($)
>> +
>> +L<Locale::Messages>'s gettext function if all goes well, otherwise our
>> +passthrough fallback function.
>
> Other packages use _T() function for that, or like Locale::TextDomain
> __() function.
>
>> +
>> +=head1 EXPORTS
>> +
>> +Exports are done via L<Exporter>. Invididual functions can be
>> +exporter, or all of them via the C<:all> export tag.
>
> Shouldn't this be described in less technical way in SYNOPSIS and
> DESCRIPTION sections instead?

Yeah, maybe. But there's a convention for `=head1 EXPORTS' in perl
core / CPAN. So that's what I added at a whim.

>> +
>> +=head1 AUTHOR
>> +
>> +E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason <avarab@xxxxxxxxx>
>> +
>> +=head1 LICENSE AND COPYRIGHT
>> +
>> +Copyright 2010 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason <avarab@xxxxxxxxx>
>> +
>> +This program is free software, you can redistribute it and/or modify
>> +it under the same terms as Perl itself.
>
> Which is dual licensed: GPL + Perl artistic.
>
> Was it intended to use 'same terms as Perl itself' rather than GPLv2
> or GPLv2+?

It's the permissive boilerplate I auto-insert for Perl code. I'll just
remove it. The COPYING file at the top level will suffice.

>> +
>> +=cut
>> diff --git a/perl/Makefile.PL b/perl/Makefile.PL
>> index 0b9deca..702ec7c 100644
>> --- a/perl/Makefile.PL
>> +++ b/perl/Makefile.PL
>> @@ -16,7 +16,10 @@ endif
>>  MAKE_FRAG
>>  }
>>
>> -my %pm = ('Git.pm' => '$(INST_LIBDIR)/Git.pm');
>> +my %pm = (
>> +     'Git.pm' => '$(INST_LIBDIR)/Git.pm',
>> +     'Git/Gettext.pm' => '$(INST_LIBDIR)/Git/Gettext.pm',
>> +);
>>
>>  # We come with our own bundled Error.pm. It's not in the set of default
>>  # Perl modules so install it if it's not available on the system yet.
>> --
>> 1.7.1.248.gcd6d1
>>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]