I have done a workaround hack. Today I attempted to hack the blame code
itself, but it did not work; I need to do more research.

But I did get a new version of the import script running, and word-level
blame going:
http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.html

The next step is ready:

1. I have a single script that will pull a given article and check its
revisions into git. It is not perfect, but it works.
http://bazaar.launchpad.net/~jamesmikedupont/+junk/wikiatransfer/revision/8

You run it like this, from inside a git repo:

    perl GetRevisions.pl "Article_Name"
    git blame Article_Name/Article.xml
    git push origin master

The code that splits up the lines is in Process File. It turns every space
into a line break, so each word ends up on its own line and we get a
word-level blame.

    if ($insidetext) {
        ## split all lines on the space
        s/(\ )/\\\n/g;
        print OUT $_;
    }

The article is here:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_declaration_of_independence/article.xml

Here are the blame results:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_declaration_of_independence/wordblame.txt

The problem is that GitHub does not like this amount of processor power
being used and kills the process, but you can run git blame locally.

Now we have a tool to easily create a repository from Wikipedia, or from
any other export-enabled MediaWiki.

mike

On Sat, Oct 17, 2009 at 8:50 AM, jamesmikedupont@xxxxxxxxxxxxxx <jamesmikedupont@xxxxxxxxxxxxxx> wrote:
> Thank you very much for your input and advice,
> I have a lot of learn about this great tool.
> I am working on learning how the existing blame tool runs now.
> Will report back when I have some code.
> mike
>
> On Sat, Oct 17, 2009 at 1:25 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> "jamesmikedupont@xxxxxxxxxxxxxx" <jamesmikedupont@xxxxxxxxxxxxxx> writes:
>>
>>> What do you think of my idea to create blames along a specific user
>>> defined byte positions ?
>>
>> Overly complicated and not enough time for _review_. If you are blaming
>> one-byte (or one-char) per line, wouldn't it be enough to consider the
>> line number in the output as byte (or char) position when reconstituting
>> the original text?
>>
>
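
To make the word-per-line idea concrete, here is a minimal sketch in Python (not the Perl from GetRevisions.pl) of the split and its reverse. It assumes the trailing backslash written by the substitution is there to mark line breaks that were originally spaces, so the text can be put back together; that reading is my assumption, not something the script documents. The function names split_words, join_words and word_at_line are made up for illustration.

```python
def split_words(text: str) -> str:
    """Replace every space with a backslash followed by a newline.

    Mirrors the effect of the Perl substitution s/(\\ )/\\\\\\n/g:
    each word lands on its own line, and the trailing backslash marks
    breaks that used to be spaces (real newlines carry no backslash).
    """
    return text.replace(" ", "\\\n")


def join_words(split_text: str) -> str:
    """Undo split_words by turning backslash+newline back into a space.

    Assumption: the original text never contained a backslash directly
    followed by a newline, otherwise the round trip is not exact.
    """
    return split_text.replace("\\\n", " ")


def word_at_line(split_text: str, line_number: int) -> str:
    """Return the word on a 1-based line number, as git blame reports it.

    The line number in the blame output then doubles as the word
    position in the article.
    """
    line = split_text.split("\n")[line_number - 1]
    # Drop the marker backslash, if this line ended at a former space.
    return line[:-1] if line.endswith("\\") else line
```

With this layout, the line number that git blame prints for the split file can be read directly as a word index, and join_words gives back the article text for display.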