On Sat, Apr 17, 2010 at 03:10:56AM +0200, Richard Hartmann wrote:
> On Sat, Apr 17, 2010 at 02:19, Sverre Rabbelier <srabbelier@xxxxxxxxx> wrote:
>
> > Assuming you do the import incrementally
> > using something like git-fast-import (feeding it with a custom
> > exporter that uses the dump as its input) you shouldn't even need an
> > extraordinary machine to do it (although you'd need a lot of storage).
>
> I am using a Python script [1] to import the XML dump.

There is also a version available at (plug):

  git://github.com/sbober/levitation-perl.git

It is a bit faster and consumes less memory (and is written in Perl).
But it, too, cannot handle enwiki at the moment.

> > Speaking of which, it might make sense to separate the
> > worktree by prefix, so articles starting with "aa" go under the "aa"
> > directory, etc?
>
> Very good idea. What command would I need to send to
> git-fast-import to do that?

levitation does that already.

> > Hope that helps, and if you do convert it (and it turns out to be
> > usable, and you decide to keep it up to date somehow), put it up
> > somewhere! :)
>
> It did.
> I will make it available if it turns out to be useful. Keeping it up to
> date might be harder unless they keep on releasing new
> (incremental) snapshots.

If desired, I could produce input files for git-fast-import for a
larger wiki (like the German or Japanese Wikipedia), so that other
people can have a look at the performance.

bye,
Sebastian
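
P.S.: To give an idea of what such an exporter feeds to git-fast-import,
here is a minimal sketch (not levitation's actual code) of the stream
format, including the two-character prefix split discussed above.
iter_revisions(), the sample tuple, and the users.invalid committer
addresses are placeholders I made up; a real exporter would stream
(title, text, author, timestamp) tuples out of the XML dump and would
have to sanitize titles and author names properly.

#!/usr/bin/env python3
import sys

def shard(title):
    """Map "Aardvark" to "aa/Aardvark.mediawiki" so the worktree is
    split into two-character prefix directories."""
    safe = title.replace('/', '%2F')   # '/' would act as a path separator
    return '%s/%s.mediawiki' % (safe[:2].lower(), safe)

def data(blob):
    """A fast-import 'data' command: exact byte count, then raw bytes."""
    raw = blob.encode('utf-8')
    return b'data %d\n' % len(raw) + raw + b'\n'

def export(revisions, out=sys.stdout.buffer):
    # One commit per revision on a single branch; fast-import stacks
    # successive commits to the same branch on top of each other.
    for n, (title, text, author, ts) in enumerate(revisions, 1):
        out.write(b'commit refs/heads/master\n')
        out.write(b'mark :%d\n' % n)
        out.write(('committer %s <%s@users.invalid> %d +0000\n'
                   % (author, author, ts)).encode('utf-8'))
        out.write(data('update %s' % title))           # commit message
        out.write(('M 100644 inline %s\n' % shard(title)).encode('utf-8'))
        out.write(data(text))                          # revision text

if __name__ == '__main__':
    # Stand-in for a real dump parser.
    sample = [('Aardvark', 'The aardvark is a mammal.\n',
               'example', 1271462400)]
    export(sample)

Piped into a fresh repository ("git init wiki && cd wiki &&
export.py | git fast-import") this creates one commit per revision.
The 'inline' form embeds each revision's text directly in the commit
that introduces it, so the whole stream can be generated in a single
pass over the dump.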