On Tue, Feb 22, 2011 at 11:33:19AM +0100, Matthieu Moy wrote:

> So, nobody's more inspired to comment on this proposal? ;-)

No, just busy. :)

> > One solution is to use a git-based wiki, like ikiwiki [1], gollum
> > [2], ... but in many contexts, this is not applicable: because one
> > has an existing wiki and doesn't want to migrate, because the
> > git-based wiki lacks features XYZ that you rely on, or because one is
> > a small contributor to a large project (say, wikipedia) and cannot
> > force the project to change.
> >
> > I'm thinking of an alternative to this: implement a foreign VCS
> > interface for a wiki engine. Today, one can interact with, say, SVN,
> > using Git (via git-svn [3]). This way, we can get most of the Git
> > goodness locally, and just "publish" the changes on an SVN repository.

I think this is a great idea. I made some first steps with the import I
did here:

  https://github.com/peff/wikitest

I'll include my quick-and-dirty mediawiki fast-exporter at the end of
this mail. I'm sure it has some bugs and corner cases it doesn't
handle, but maybe it can help as a starting point. Its output is meant
to be piped straight into "git fast-import" in a fresh repository.

And then of course we'd need incremental fetching, which I planned to
do by hitting the "recent changes" API in mediawiki (there's a rough
sketch of that after the script below). And some method for pushing
changes back up, which I didn't really look at beyond a quick note at
the end.

> > * Ability to import only a subset of the wiki (nobody wants to "git
> > clone" the whole wikipedia ;-) ). At least a manually-specified list
> > of pages, and better, the content of one category.

Neat idea. One thing that might be useful for a site like wikipedia is
"fetch this page, and any pages it links to, pages they link to, and so
on, to a recursion depth of N". So if you are interested in some topic
you could get related topics. But that's a lot harder, since it means
fetching and parsing the mediawiki pages, whereas the rest can be done
through the API (though see the link-walking sketch below).

Anyway, here is my mediawiki fast-exporter. Like I said, quick and
dirty.

-- >8 --
#!/usr/bin/perl
use strict;
use MediaWiki::API;
use DB_File;
use Storable qw(freeze thaw);
use DateTime::Format::ISO8601;
use Encode qw(encode_utf8);

my $url = shift;
my $mw = MediaWiki::API->new;
$mw->{config}->{api_url} = "$url/api.php";

# Grab the list of all pages on the wiki.
my $pages = $mw->list({
    action => 'query',
    list => 'allpages',
    aplimit => 500,
});

# Keep everything in a db so we are restartable.
my $revdb = tie my %revisions, 'DB_File', 'revisions.db';

print STDERR "Fetching revisions...\n";
my $n = 1;
foreach my $page (@$pages) {
    my $id = $page->{pageid};

    print STDERR "$n/", scalar(@$pages), ": $page->{title}\n";
    $n++;

    # Pages already in the db were fetched on a previous run.
    next if exists $revisions{$id};

    my $q = {
        action => 'query',
        prop => 'revisions',
        rvprop => 'content|timestamp|comment|user|ids',
        rvlimit => 10,
        pageids => $page->{pageid},
    };
    my $p;
    while (1) {
        my $r = $mw->api($q);

        # Write out all content to files.
        foreach my $rev (@{$r->{query}->{pages}->{$id}->{revisions}}) {
            my $fn = "$rev->{revid}.rev";
            open(my $fh, '>', $fn)
                or die "unable to open $fn: $!";
            binmode $fh, ':utf8';
            print $fh $rev->{'*'};
            close($fh);
            delete $rev->{'*'};
        }

        # And then save the rest, appending if necessary.
        if (defined $p) {
            push @{$p->{revisions}},
                 @{$r->{query}->{pages}->{$id}->{revisions}};
        }
        else {
            $p = $r->{query}->{pages}->{$id};
        }

        # And continue or quit, depending on the output.
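        # (Revisions come back in batches of rvlimit; when there are
        # older ones left, the response carries a query-continue block
        # with the rvstartid to resume from on the next request.)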
        last unless $r->{'query-continue'};
        $q->{rvstartid} = $r->{'query-continue'}->{revisions}->{rvstartid};
    }

    print STDERR "  Fetched ", scalar(@{$p->{revisions}}), " revisions.\n";
    $revisions{$id} = freeze($p);
    $revdb->sync;
}

# Make a flat list of all page revisions, so we can
# interleave them in date order.
my @revisions = map {
    my $page = thaw($revisions{$_});
    my @revisions = @{$page->{revisions}};
    delete $page->{revisions};
    $_->{page} = $page foreach @revisions;
    @revisions;
} keys(%revisions);

print STDERR "Writing export data...\n";
# fast-import streams are byte-oriented; don't let any layer munge them.
binmode STDOUT, ':raw';
$n = 1;
foreach my $rev (sort { $a->{timestamp} cmp $b->{timestamp} } @revisions) {
    my $user = $rev->{user} || 'Anonymous';
    my $dt = DateTime::Format::ISO8601->parse_datetime($rev->{timestamp});
    my $fn = "$rev->{revid}.rev";
    my $size = -s $fn;
    my $comment = defined $rev->{comment} ? $rev->{comment} : '';
    my $title = $rev->{page}->{title};
    $title =~ y/ /_/;

    print STDERR "$n/", scalar(@revisions), ": $rev->{page}->{title}\n";
    $n++;

    # Emit one fast-import commit per wiki revision; "data" lengths
    # must be in bytes, so encode the comment before measuring it.
    my $comment_bytes = encode_utf8($comment);
    print "commit refs/remotes/origin/master\n";
    print "committer $user <none\@example.com> ", $dt->epoch, " +0000\n";
    print "data ", length($comment_bytes), "\n", $comment_bytes;
    print "M 644 inline $title.wiki\n";
    print "data $size\n";
    open(my $fh, '<', $fn)
        or die "unable to open $fn: $!";
    binmode $fh, ':raw';
    while (read($fh, my $buf, 4096)) {
        print $buf;
    }
}
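
As for the incremental fetching: something like this is what I had in
mind (completely untested; $since is a placeholder, you would want to
record the newest timestamp from the previous import). It just asks the
API which pages changed and drops them from the db, so the fetch loop
above re-downloads them on the next run. That re-fetches the whole
history of a changed page, which obviously doesn't scale, but it shows
the API side:

# Hypothetical incremental step, reusing $mw and the tied %revisions
# from the exporter above.
my $since = '2011-02-22T00:00:00Z'; # newest timestamp from last run
my $changes = $mw->list({
    action => 'query',
    list => 'recentchanges',
    rcprop => 'title|ids|timestamp',
    rcstart => $since,
    rcdir => 'newer',
    rclimit => 500,
});
foreach my $rc (@$changes) {
    # Forget the cached history so the page gets re-fetched in full.
    delete $revisions{$rc->{pageid}};
}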
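
And on the "recursion depth of N" idea: the API does seem to expose
outgoing links via prop=links, so if that turns out to be enough (I
haven't checked how it interacts with namespaces or redirects), the
page selection could be a breadth-first walk instead of list=allpages.
A rough sketch:

# Hypothetical replacement for the list=allpages query: collect $start
# plus everything reachable from it within $depth link hops.
sub collect_linked_pages {
    my ($mw, $start, $depth) = @_;
    my %seen = ($start => 1);
    my @frontier = ($start);
    while ($depth-- > 0 && @frontier) {
        my @next;
        foreach my $title (@frontier) {
            # Ask for the page's outgoing links (this ignores
            # query-continue, so only the first 500 links per page).
            my $r = $mw->api({
                action => 'query',
                prop => 'links',
                titles => $title,
                pllimit => 500,
            });
            foreach my $page (values %{$r->{query}->{pages}}) {
                foreach my $link (@{$page->{links} || []}) {
                    push @next, $link->{title}
                        unless $seen{$link->{title}}++;
                }
            }
        }
        @frontier = @next;
    }
    return keys %seen;
}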
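
Finally, pushing. I haven't thought that through at all (conflict
detection is the hard part), but at the single-page level MediaWiki::API
will fetch the edit token behind the scenes, so the raw write should be
roughly (untested, credentials obviously made up):

# Hypothetical push of one page, reusing $mw from above; login is only
# needed for wikis that don't allow anonymous edits.
my $new_content = "new page text\n"; # whatever came out of git
$mw->login({ lgname => 'someuser', lgpassword => 'secret' })
    or die $mw->{error}->{code} . ': ' . $mw->{error}->{details};
$mw->edit({
    action => 'edit',
    title => 'Some page',
    text => $new_content,
    summary => 'pushed from git',
}) or die $mw->{error}->{code} . ': ' . $mw->{error}->{details};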