Re: three-way diff performance problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Tue, 21 Jul 2009, Junio C Hamano wrote:
> 
> I know why.  The conversion was wrong.  The original found the last_one
> that was from the parent we are looking at, and when there is such, it
> started the scan from the one _after that_.  Otherwise, if lost_head list
> had an entry
> 
> 	@@ -l,k +m,n @@
>         -one
>          two
>          three
> 
> and the diff with the parent we are currently looking at duplicates
> removal:
> 
> 	@@ -l,k +m1,n1 @@
> 	-one
>         -one
>          two
>          three
> 
> we will end up losing the second removal, which would be what is happening
> with the patch you tried.

Ok, that matches what I saw. Doing a 'diff' between the outputs showed 
missing lines from the result with your first patch.

> I actually was scratching my head wondering why it wasn't happening in the
> original code after I sent that faulty patch.
> 
> Here is another attempt.

Yup. This attempt now generates output that matches the original one (in 
44s), while at the same time still handling the previously very expensive 
case quickly (1.4s).

So it looks good now from my testing - I'm not going to say anything else 
about the patch, since I'm not all that familiar with the code itself.

Btw, the 'perf report' on the fixed build now says

    89.67%               git  /home/torvalds/git/git     [.] xdl_prepare_env
     6.42%               git  /home/torvalds/git/git     [.] xdl_recs_cmp
     0.92%               git  [kernel]                   [k] hpet_next_event
     0.52%               git  /home/torvalds/git/git     [.] xdl_prepare_ctx
     0.44%               git  /lib64/libz.so.1.2.3       [.] inflate_fast
     0.29%               git  /home/torvalds/git/git     [.] xdl_hash_record
     0.27%               git  /lib64/libc-2.10.1.so      [.] __GI_memcmp
     0.26%               git  /home/torvalds/git/git     [.] show_patch_diff
     0.14%               git  [kernel]                   [k] _spin_lock
     0.09%               git  [kernel]                   [k] clear_page_c
     0.09%               git  /lib64/libz.so.1.2.3       [.] adler32

so now the expensive part is xdl_prepare_env. And that does actually make 
sense: it's the code that generates all the hash records for the lines to 
be compared, so there doesn't seem to be anything hugely wrong with it. If 
there are a lot of identical lines, I'd expect that to be fairly 
expensive.

			Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]