On Sun, Oct 29, 2017 at 10:51 PM, Antoine Beaupré <anarcat@xxxxxxxxxx> wrote:
> When we specify a list of namespaces to fetch from, by default the MW
> API will not fetch from the default namespace, referred to as "(Main)"
> in the documentation:
>
> https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces
>
> I haven't found a way to address that "(Main)" namespace when getting
> the namespace ids: indeed, when listing namespaces, there is no
> "canonical" field for the main namespace, although there is a "*"
> field that is set to "" (empty). So in theory, we could specify the
> empty namespace to get the main namespace, but that would make
> specifying namespaces harder for the user: we would need to teach
> users about the "empty" default namespace. It would also make the code
> more complicated: we'd need to parse quotes in the configuration.
>
> So we simply override the query here and allow the user to specify
> "(Main)" since that is the publicly documented name.

Thanks, this explanation makes the patch a lot clearer. More below...

> Signed-off-by: Antoine Beaupré <anarcat@xxxxxxxxxx>
> ---
> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
> @@ -264,9 +264,14 @@ sub get_mw_tracked_categories {
>  sub get_mw_tracked_namespaces {
>  	my $pages = shift;
>  	foreach my $local_namespace (@tracked_namespaces) {
> -		my $namespace_id = get_mw_namespace_id($local_namespace);
> +		my ($namespace_id, $mw_pages);
> +		if ($local_namespace eq "(Main)") {
> +			$namespace_id = 0;
> +		} else {
> +			$namespace_id = get_mw_namespace_id($local_namespace);
> +		}

I meant to ask this in the previous round, but with the earlier patch
mixing several distinct changes into one, I plumb forgot: Would it
make sense to move this "(Main)" special case into
get_mw_namespace_id() itself? After all, that function is all about
determining an ID associated with a name, and "(Main)" is a name.
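
Something along these lines, perhaps. This is only a rough sketch of
the idea, not a proposed patch: lookup_namespace_id() is a
hypothetical stand-in for whatever get_mw_namespace_id() does today
to resolve a name, and the lookup table is illustration data rather
than anything read from a wiki.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the function's existing name lookup.
# The table is illustration data only.
my %known_namespaces = (Talk => 1, User => 2, File => 6, Special => -1);

sub lookup_namespace_id {
	my $name = shift;
	return $known_namespaces{$name};
}

sub get_mw_namespace_id {
	my $name = shift;

	# "(Main)" is the documented name of the default namespace. It has
	# no "canonical" field in the namespace listing, so map it to id 0
	# here instead of in every caller.
	return 0 if $name eq '(Main)';

	return lookup_namespace_id($name);
}

# The caller no longer needs a special case of its own.
foreach my $local_namespace ('(Main)', 'Talk', 'Special') {
	my $namespace_id = get_mw_namespace_id($local_namespace);
	next if !defined $namespace_id;
	next if $namespace_id < 0; # virtual namespaces don't support allpages
	print "$local_namespace => $namespace_id\n";
}
```

With the special case inside the lookup function, the loop in
get_mw_tracked_namespaces() could keep its original one-line
"my $namespace_id = get_mw_namespace_id($local_namespace);".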

>  		next if $namespace_id < 0; # virtual namespaces don't support allpages
> -		my $mw_pages = $mediawiki->list( {
> +		$mw_pages = $mediawiki->list( {

Why did the "my" of $mw_pages get moved up to the top of the foreach
loop? I can't seem to see any reason for it. Is this an unrelated
change accidentally included in this patch?

>  			action => 'query',
>  			list => 'allpages',
>  			apnamespace => $namespace_id,
> --