Re: [PATCH 2/2] blame: introduce -u/--unique option

Namhyung Kim <namhyung@xxxxxxxxx> · Thu, 10 Mar 2011 15:13:41 +0900

2011-03-09 (ì), 11:45 -0800, Junio C Hamano:
> Namhyung Kim <namhyung@xxxxxxxxx> writes:
> 
> > -u/--unique option will find and use minimum length of unique
> > SHA-1 name. If -l option is specified also, it will have higher
> > priority, IOW git blame will use full 40-length SHA-1 name.
> 
> > @@ -1867,6 +1878,10 @@ static void find_alignment(struct scoreboard *sb, int *option)
> >  			longest_dst_lines = num;
> >  		if (largest_score < ent_score(sb, e))
> >  			largest_score = ent_score(sb, e);
> > +		sha1 = find_unique_abbrev(suspect->commit->object.sha1,
> > +					  MINIMUM_ABBREV);
> > +		if (longest_uniq_sha1 < strlen(sha1))
> > +			longest_uniq_sha1 = strlen(sha1);
> 
> The logic to determine and keep track of the longuest-unique looks
> correct, but I was hoping that we already have an easy optimization
> codepath to do this only once per commit, not for every blame-entry in the
> result.  Doesn't the code have a similar optimization to figure out the
> necessary number of columns to show author names (I haven't read the code
> recently, though)?
> 

Right. METAINFO_SHOWN flag does that. I'll move the code under the block
in next version.

> Also we might find that the performance impact of doing this may be so
> miniscule that it is not worth wasting a short option name.  If we were to
> use an option, I was actually hoping that the option would let the users
> specify a value different from the hardcoded 8 at the same time.  E.g.
> 
>     git blame --abbrev=8 ;# current default with uniquefy applied
>     git blame --abbrev=4 ;# equivalent to your "blame -u"
> 

Hmm... I thought about that too. But you mean that you only want
--abbrev instead of -u, right?

My original intention was if user specified --abbrev explicitly, it
should be honored regardless of the uniqueness. The guard will not be
used in this situation because the user gave the exact length [s]he
wants to see.

> Can we have a benchmark of this feature in a largish and busy file in a
> project with a deep history?
> 

I gave it a try on a file in Linux kernel (with METAINFO_SHOWN
optimization applied):

$ wc -l mm/page_alloc.c
5629 mm/page_alloc.c
$ git log --oneline mm/page_alloc.c | wc -l
566
$ perf record ../git/git blame -u mm/page_alloc.c > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.229 MB perf.data (~10014 samples) ]
$ perf report
# Events: 5K cycles
#
# Overhead  Command         Shared Object                         Symbol
# ........  .......  ....................  .............................
#
    21.11%      git  git                   [.] cmd_blame
    18.34%      git  git                   [.] get_origin
    18.32%      git  git                   [.] pass_blame
     3.99%      git  libz.so.1.2.3.3       [.] inflate
     3.14%      git  git                   [.] xdl_hash_record
     2.84%      git  libz.so.1.2.3.3       [.] inflate_table
     2.60%      git  libz.so.1.2.3.3       [.] inflate_fast
     2.23%      git  git                   [.] find_pack_entry_one
     1.84%      git  git                   [.] xdl_recmatch
     1.51%      git  libc-2.11.1.so        [.] memcpy
     1.49%      git  git                   [.] xdl_prepare_env
     1.36%      git  git                   [.] xdl_prepare_ctx
     1.13%      git  [kernel.kallsyms]     [k] __lock_acquire
     1.12%      git  git                   [.] xdi_diff
     1.09%      git  git                   [.] blame_chunk
     0.99%      git  libc-2.11.1.so        [.] _int_malloc
     0.79%      git  git                   [.] tree_entry_interesting
     0.77%      git  git                   [.] origin_decref
     0.67%      git  libz.so.1.2.3.3       [.] adler32
     0.42%      git  [kernel.kallsyms]     [k] sched_clock_local
     0.40%      git  git                   [.] decode_tree_entry
     0.38%      git  [kernel.kallsyms]     [k] clear_page_c
     0.35%      git  libc-2.11.1.so        [.] __GI___strncmp_ssse3
     0.32%      git  git                   [.] lookup_object
     0.31%      git  [kernel.kallsyms]     [k] lock_release
     0.29%      git  [kernel.kallsyms]     [k] hlock_class
     0.29%      git  [kernel.kallsyms]     [k] page_fault
     0.29%      git  git                   [.] patch_delta
     0.28%      git  [kernel.kallsyms]     [k] local_clock
     0.27%      git  [kernel.kallsyms]     [k] sched_clock
     0.25%      git  [kernel.kallsyms]     [k] delay_tsc
     0.25%      git  git                   [.] parse_commit_buffer
     0.25%      git  [kernel.kallsyms]     [k] look_up_lock_class
     0.23%      git  git                   [.] xdl_cha_alloc
     0.22%      git  libc-2.11.1.so        [.] _int_free
     0.22%      git  libc-2.11.1.so        [.] malloc_consolidate
     0.22%      git  [kernel.kallsyms]     [k] mark_lock
     0.21%      git  libc-2.11.1.so        [.] __malloc
     0.20%      git  git                   [.] nth_packed_object_offset

Looks like it doesn't have a performance impact IMHO.
Hope this helps.

Thank you.

-- 
Regards,
Namhyung Kim

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html