Re: [PATCH 4/4] remote-mediawiki: allow using (Main) as a namespace and skip special namespaces

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2017-10-29 15:49:28, Eric Sunshine wrote:
> On Sun, Oct 29, 2017 at 12:08 PM, Antoine Beaupré <anarcat@xxxxxxxxxx> wrote:
>> Subject: remote-mediawiki: allow using (Main) as a namespace and skip special namespaces
>
> This patch is more difficult to review than it perhaps ought to be
> since it is making multiple unrelated changes.
>
> It's not clear from the description what special namespaces are and
> why they need to be skipped. It's also not clear why (Main) is
> special. Perhaps the commit message(s) could explain these issues in
> more detail.
>
> To simplify review and make it easier to gauge what it going on, it
> might make sense to split this patch into at least two: one which
> skips "special namespaces", and one which gives special treatment to
> (Main).

Agreed, I'll try to do that.

> More below...
>
>> Reviewed-by: Antoine Beaupré <anarcat@xxxxxxxxxx>
>> Signed-off-by: Antoine Beaupré <anarcat@xxxxxxxxxx>
>> ---
>> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
>> @@ -264,16 +264,27 @@ sub get_mw_tracked_categories {
>>  sub get_mw_tracked_namespaces {
>>      my $pages = shift;
>> -    foreach my $local_namespace (@tracked_namespaces) {
>> -        my $mw_pages = $mediawiki->list( {
>> -            action => 'query',
>> -            list => 'allpages',
>> -            apnamespace => get_mw_namespace_id($local_namespace),
>> -            aplimit => 'max' } )
>> -            || die $mediawiki->{error}->{code} . ': '
>> -                . $mediawiki->{error}->{details} . "\n";
>> -        foreach my $page (@{$mw_pages}) {
>> -            $pages->{$page->{title}} = $page;
>> +    foreach my $local_namespace (sort @tracked_namespaces) {
>> +        my ($mw_pages, $namespace_id);
>> +        if ($local_namespace eq "(Main)") {
>> +            $namespace_id = 0;
>> +        } else {
>> +            $namespace_id = get_mw_namespace_id($local_namespace);
>> +        }
>> +        if ($namespace_id >= 0) {
>
> This may be problematic since get_mw_namespace_id() may return undef
> rather than a number, in which case Perl will complain. Since the code
> skips the $mediawiki query altogether when it encounters "(Main)", you
> could fix this problem and simplify the code overall by simply
> skipping the bulk of the foreach loop body instead of mucking around
> with $namespace_id. For instance:
>
>     foreach my $local_namespace (sort @tracked_namespaces) {
>         next if ($local_namespace eq "(Main)");
>         ...normal processing...
>     }

Ah yes. I see your point but it doesn't actually skip the query when it
encouters main ($namespace_id >= 0).

>> +            if ($mw_pages = $mediawiki->list( {
>> +                action => 'query',
>> +                list => 'allpages',
>> +                apnamespace => $namespace_id,
>> +                aplimit => 'max' } )) {
>> +                print {*STDERR} "$#{$mw_pages} found in namespace $local_namespace ($namespace_id)\n";
>
> The original code did not emit this diagnostic but the new code does
> so unconditionally. Is this just leftover debugging code or is
> intended that all users should see this information all the time?

This is a known issue that permeates the whole remote at this point, and
it is quite annoying.

https://github.com/Git-Mediawiki/Git-Mediawiki/issues/30

I have, however, considered it useful to include this to show progress
as it can take a while to fetch all namespace information...

Obviously, once we figure out how to silence this stuff (ie. how to
recognize -q), it should be silenced like everything else, but until
then I think it's quite useful.

>> +                foreach my $page (@{$mw_pages}) {
>> +                    $pages->{$page->{title}} = $page;
>> +                }
>> +            } else {
>> +                warn $mediawiki->{error}->{code} . ': '
>> +                    . $mediawiki->{error}->{details} . "\n";
>
> I guess this is the part which "skips special namespaces". The
> original code die()'d but this merely warns. Aside from these "special
> namespaces", are there genuine cases when the $mediawiki query would
> return an error, and which should indeed die(), or is warning
> appropriate for all $mediawiki query error cases?

Maybe I didn't get the indentation right, but this } else { is for query
failures, *not* the if ($namespace_id < 0). So < 0 is just silently
skipped.

The original code was die()'ing on failures, but I think that's a
mistake: we should fetch what we can and warn on the failures. That
allows the user to fix multiple problems at once instead of having to
rerun the script repeatedly.

A.

-- 
Le féminisme n'a jamais tué personne
Le machisme tue tous les jours.
                        - Benoîte Groulx



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux