Re: Introduction and Wikipedia and Git Blame

I have a workaround hack in place.
Today I tried to hack the blame code itself, but that did not work;
I need to do more research first.

But I did get a new version of the import script running, with
word-level blame working:

http://fmtyewtk.blogspot.com/2009/10/mediawiki-git-word-level-blaming-one.html

The next step is ready:

1. I have a single script that pulls a given article and checks its
revisions into git.
It is not perfect, but it works.

http://bazaar.launchpad.net/~jamesmikedupont/+junk/wikiatransfer/revision/8
You run it like this, from inside a git repo:

perl GetRevisions.pl "Article_Name"

git blame Article_Name/Article.xml
git push origin master

The code that splits up each line is in Process File; it replaces every
space with a line break (a backslash followed by a newline), so each
word ends up on its own line.
That way we get a word-level blame.

     if ($insidetext)
     {
         ## replace each space with a backslash and a newline,
         ## so that every word lands on its own line
         s/(\ )/\\\n/g;

         print OUT $_;
     }
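
As an illustration only, the same split and its reverse can be sketched in Python (this is not the code from GetRevisions.pl; the function names split_words and join_words are made up for the example). It assumes the article text does not already contain a backslash directly followed by a newline, otherwise the reverse step is not exact.

```python
def split_words(text):
    # Replace every space with a backslash followed by a newline,
    # mirroring the Perl substitution shown above.
    return text.replace(" ", "\\\n")


def join_words(split_text):
    # Reverse the split: backslash + newline becomes a space again.
    # Assumption: the original text had no backslash-newline pairs.
    return split_text.replace("\\\n", " ")


line = "Kosovo declared independence"
split = split_words(line)
print(split)                      # one word per line, each ending in a backslash
print(join_words(split) == line)  # round trip check
```

Because every word sits on its own line, an ordinary line-based git blame then reports one author per word.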


The article is here:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_declaration_of_independence/article.xml


Here are the blame results:
http://github.com/h4ck3rm1k3/KosovoWikipedia/blob/master/Wiki/2008_Kosovo_declaration_of_independence/wordblame.txt


The problem is that GitHub does not like this much processor power
being used and kills the process, but you can run git blame locally.
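
For the local run, one possible way to summarise the result is to count words per author from the output of "git blame --line-porcelain". The Python sketch below is an assumption about how one might do that, not part of the import script; the function name words_per_author and the sample authors are hypothetical, and the sample is abbreviated (real output carries more header lines per entry).

```python
from collections import Counter


def words_per_author(porcelain):
    # In --line-porcelain output every blamed line is preceded by header
    # lines (including "author <name>") and the content line itself
    # starts with a TAB character.
    counts = Counter()
    author = None
    for line in porcelain.split("\n"):
        if line.startswith("\t"):
            # Content line: one word per line after the split.
            if author is not None and line.strip():
                counts[author] += 1
        elif line.startswith("author "):
            author = line[len("author "):]
    return counts


# Hypothetical, abbreviated sample of porcelain output.
SAMPLE = "\n".join([
    "1" * 40 + " 1 1 2",
    "author Alice",
    "author-mail <alice@example.org>",
    "filename article.xml",
    "\tKosovo\\",
    "1" * 40 + " 2 2",
    "author Alice",
    "filename article.xml",
    "\tdeclared\\",
    "2" * 40 + " 3 3 1",
    "author Bob",
    "filename article.xml",
    "\tindependence",
])

for name, n in words_per_author(SAMPLE).most_common():
    print(name, n)
```

In practice the input would come from something like "git blame --line-porcelain Article_Name/Article.xml > blame.txt", with the file contents passed to the function.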

Now we have a tool to easily create a repository from Wikipedia, or
any other export-enabled MediaWiki.

mike


On Sat, Oct 17, 2009 at 8:50 AM, jamesmikedupont@xxxxxxxxxxxxxx
<jamesmikedupont@xxxxxxxxxxxxxx> wrote:
> Thank you very much for your input and advice,
> I have a lot to learn about this great tool.
> I am working on learning how the existing blame tool runs now.
> Will report back when I have some code.
> mike
>
> On Sat, Oct 17, 2009 at 1:25 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> "jamesmikedupont@xxxxxxxxxxxxxx" <jamesmikedupont@xxxxxxxxxxxxxx> writes:
>>
>>> What do you think of my idea to create blames along a specific user
>>> defined byte positions ?
>>
>> Overly complicated and not enough time for _review_.  If you are blaming
>> one-byte (or one-char) per line, wouldn't it be enough to consider the
>> line number in the output as byte (or char) position when reconstituting
>> the original text?
>>
>