In the past we have used mw-render to convert wiki pages to XML. The conversion isn't 100% accurate, but it can be a huge time saver during the initial conversion from the wiki. mw-render is part of the python-mwlib package. It has been kind of up and down: it was working at least as of F18, and I haven't tried it since.

I put together a script that reads a list of beat names and converts each beat to XML, plus a sed script to clean up some of the more obvious hiccups in the mw-render output. Although the result isn't perfect, it is way better than cut and paste when you have a number of beats to do. For later cleanup, cut and paste is better:

    #!/bin/sh
    # Convert each beat listed in ./WikiList from the wiki to DocBook XML.
    MWR="mw-render_out"   # raw mw-render output
    XML="XML_Files"       # output after the sed cleanup

    # Start from empty output directories.
    rm -rf "${XML}" "${MWR}"
    mkdir "${XML}" "${MWR}"

    for i in $(cat ./WikiList); do
        BEAT="Documentation_${i}_Beat"
        echo "====== $BEAT ======="
        /usr/bin/mw-render -c http://fedoraproject.org/w/ \
            -w docbook "$BEAT" -o "${MWR}/${i}.inter"
        sed -f sedscr "${MWR}/${i}.inter" > "${XML}/${i}.xml"
    done

My clumsy sed script (sedscr) is:

    s/<sectioninfo>//
    s?</sectioninfo>??
    s/<para>/\n<para>\n/
    s?</para>?\n</para>?
    s/></>\n</
    s?<emphasis>?<package>?
    s?</emphasis>?</package>?
    s? </title?</title?
    s?<itemizedlist?\n<itemizedlist?
    s/<book>//
    s?</book>??
    s?<articleinfo>??
    s?</articleinfo>??
    s?<article lang="en">??
    s?</article>??

Mostly it gets rid of wrapper tags we don't need, and it also puts some key tags at the beginning of the line for easier editing and formatting. I generally use emacs to reformat the result so it is readable.

The WikiList is simply a list of beat names like:

    Printing
    Desktop
    Productivity
    Networking

Not perfect, but a good time saver.

--McD
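Since mw-render may or may not work on a given release, it can help to check the beat list before running the real conversion. The following is only a sketch of a dry run under my own assumptions: the function names beat_page_name and dry_run are hypothetical, the WikiList is assumed to hold one beat name per line, and nothing here calls mw-render. It just prints the wiki page name and the output file the script above would use for each beat.

```shell
#!/bin/sh
# Hypothetical helper: build the wiki page name for one beat,
# using the same Documentation_<name>_Beat pattern as the script above.
beat_page_name() {
    printf 'Documentation_%s_Beat\n' "$1"
}

# Hypothetical dry run: read beat names from stdin (one per line, an
# assumption) and print the page name and target XML file for each.
# No conversion is performed.
dry_run() {
    while IFS= read -r beat; do
        [ -n "$beat" ] || continue   # skip blank lines
        printf '%s -> XML_Files/%s.xml\n' "$(beat_page_name "$beat")" "$beat"
    done
}
```

To try it, source the functions and run: dry_run < ./WikiList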