On Sun, May 19, 2013 at 9:40 AM, Ramkumar Ramachandra <artagnon@xxxxxxxxx> wrote: > Okay, let's look at this part. > > Felipe Contreras wrote: >> diff --git a/contrib/related/git-related b/contrib/related/git-related >> new file mode 100755 >> index 0000000..4f31482 >> --- /dev/null >> +++ b/contrib/related/git-related >> @@ -0,0 +1,124 @@ >> +#!/usr/bin/env ruby >> + >> +# This script finds people that might be interested in a patch >> +# usage: git related <file> >> + >> +$since = '5-years-ago' >> +$min_percent = 10 >> + >> +def fmt_person(name, email) >> + name ? '%s <%s>' % [name, email] : email >> +end >> + >> +class Commit >> + >> + attr_reader :persons >> + >> + def initialize(id) >> + @id = id >> + @persons = [] >> + end > > Okay, although I'm wondering what id is. The commit's unique identifier. How is that not clear? >> + def parse(data) >> + msg = nil >> + data.each_line do |line| >> + if not msg >> + case line >> + when /^author ([^<>]+) <(\S+)> (.+)$/ >> + @persons << fmt_person($1, $2) >> + when /^$/ >> + msg = true >> + end > > When there's a blank line, flip the switch and start looking in the > commit message. Why though? You're worried that the commit message > will contain a line matching /^author ([^<>]+) <(\S+)> (.+)$/? If so, > why don't you split('\n', 2), look for the author in the $1 and > Whatevered-by in $2? That's not efficient, it's more efficient to parse line by line lazily, and it's not that complicated. >> + else >> + if line =~ /^(Signed-off|Reviewed|Acked)-by: ([^<>]+) <(\S+?)>$/ > > elif? It will get bigger soon enough. > Can this be more generic to include Whatevered-by? That's done in later patches. This is a minimal version. >> + @persons << fmt_person($2, $3) > > Will $2 ever be nil (from fmt_person)? ie. Why are you checking for > the special case " <\S+?>$"? Yes, '<email>' was valid in earlier versions of git. >> + end >> + end >> + end >> + @persons.uniq! > > So you don't care if the person has appeared twice or thrice in the message. Yes I do, otherwise they be counted multiple times, and their participation calculation would go beyond 100%. >> + def size >> + @items.size >> + end > > #size is reminiscent of Array#size and Hash#size. Precisely. I would even include Array if it made sense. >> + def each(&block) >> + @items.each(&block) >> + end > > I can see how you iterate over commits from the code below. However, > using #each like this is confusing, because the reader expects #each > to be invoked on an Enumerable. So, I have two suggestions: We can include Enumerable, it won't make a difference. > 1. Use block_given? to make sure that #each works even without a block > (just like Enumerator#each). It already works just like Enumerator#each; commits.each returns an Enumerator because @items.each returns an Enumerator. > 2. Mixin Enumerable by putting in an 'include Enumerable' after the > class line. Aside from clarity, you get stuff like #find for free. Stuff we don't need or want. >> + def import > > From reading the code below, your calling semantics are: first set > each @items[id] to a new Commit object. import is meant to invoke > Commit#parse and set @persons there. Okay. > >> + return if @items.empty? >> + File.popen(%w[git cat-file --batch], 'r+') do |p| >> + p.write(@items.keys.join("\n")) >> + p.close_write > > Okay. > >> + p.each do |l| >> + if l =~ /^(\h{40}) commit (\d+)/ > > s/l/line/? Fine. >> + id, len = $1, $2 >> + data = p.read($2.to_i) >> + @items[id].parse(data) >> + end >> + end >> + end >> + end >> + >> + def get_blame(source, start, len, from) >> + return if len == 0 >> + len ||= 1 > > Please don't use ||=. It is notorious for causing confusion in > Ruby-land. Hint: it's not exactly equivalent to either len = len || 1 > or len || len = 1. How exactly is it not equivalent to len = len || 1? This is what I want: len = 1 if not len And I think the above does exactly that. >> + File.popen(['git', 'blame', '--incremental', '-C', > > Still no -CCC? I forgot. >> + '-L', '%u,+%u' % [start, len], >> + '--since', $since, from + '^', >> + '--', source]) do |p| >> + p.each do |line| >> + if line =~ /^(\h{40})/ >> + id = $1 > > Use $0 and remove the parens: you're matching the whole line. No, I'm not matching the whole line, but you are right; there's no need for groups. >> + end >> +end >> + >> +count_per_person.each do |person, count| >> + percent = count.to_f * 100 / commits.size >> + next if percent < $min_percent >> + puts person > > Not going to print percentage as well? Later. This does the minimum. > Overall, significant improvement over v3 which used all kinds of > unnecessarily complex data structures and convoluted logic. It's coming. -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html