Re: cherry-pick is slow

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Unfortunately, I do not think that the actual implementation of
> "cherry-pick" matches that expectation, as it is a full three-way merge.
>
> I am somewhat curious to see what the performance characteristics would be
> if the same commit is replayed using
>
> 	git format-patch -1 --stdout $commit | git apply --index --3way
>
> pipeline.  Depending on the number of paths in the whole tree vs the
> number of paths the $commit touches, I wouldn't be surprised if it is
> faster.

An unscientific datapoint shows that with a project as small as the kernel,
the difference is noticeable.

For example, v3.4-rc7-22-g3911ff3 (random tip of the day) touches two
paths, and cherry-picking it on top of v3.3 goes like this:

    $ git checkout v3.3 && EDITOR=: /usr/bin/time git cherry-pick 3911ff3
     Author: Jiri Kosina <jkosina@xxxxxxx>
     2 files changed, 2 insertions(+)
    1.08user 0.20system 0:01.28elapsed 99%CPU (0avgtext+0avgdata 469728maxresident)k
    0inputs+7536outputs (0major+52604minor)pagefaults 0swaps

as opposed to an alternative that touches only these two paths:

    $ git checkout v3.3 && EDITOR=: /usr/bin/time sh -c '
	git format-patch --stdout -1 3911ff3 | git am -3'
    Applying: genirq: export handle_edge_irq() and irq_to_desc()
    0.36user 0.16system 0:00.46elapsed 112%CPU (0avgtext+0avgdata 254720maxresident)k
    0inputs+14872outputs (0major+55145minor)pagefaults 0swaps

Of course, there are vast differences between v3.3 and 3911ff3^1; 11k+
paths touched, countless paths created and deleted.
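
For anyone who wants to poke at this without a kernel checkout, here is
a rough sketch that builds a throwaway repository with the same shape
(lots of unrelated churn between the base and the parent of a commit
that touches two paths), counts the paths involved, and replays the
commit both ways to check that the resulting trees agree.  The file
names, the tag names "old" and "pick", and the sizes are all made up for
illustration; it says nothing about timing by itself.

```shell
#!/bin/sh
# Sketch only: repository layout, tag names and sizes are hypothetical.
set -e

GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
GIT_EDITOR=:
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_EDITOR

repo=$(mktemp -d)
cd "$repo"
git init -q .

# "old" plays the role of v3.3: a tree with 200 paths.
i=0
while [ "$i" -lt 200 ]; do
	echo "line $i" >"file$i.txt"
	i=$((i + 1))
done
git add .
git commit -q -m base
git tag old

# Unrelated churn between "old" and the parent of the commit to replay.
i=100
while [ "$i" -lt 200 ]; do
	echo "changed $i" >>"file$i.txt"
	i=$((i + 1))
done
git commit -q -a -m churn

# "pick" plays the role of 3911ff3: it touches only two paths.
echo fix >>file1.txt
echo fix >>file2.txt
git commit -q -a -m "touch two paths"
git tag pick

# Paths the commit touches vs. paths that differ between the bases
# vs. paths in the whole tree.
touched=$(git diff --name-only 'pick^' pick | wc -l | tr -d ' ')
churn=$(git diff --name-only old 'pick^' | wc -l | tr -d ' ')
total=$(git ls-tree -r --name-only old | wc -l | tr -d ' ')
echo "touched=$touched churn=$churn total=$total"

# Replay 1: cherry-pick, i.e. a full three-way merge.
git checkout -q old
git cherry-pick pick >/dev/null
tree_cherry_pick=$(git rev-parse 'HEAD^{tree}')

# Replay 2: format-patch piped to "am -3".
git checkout -q old
git format-patch --stdout -1 pick | git am -3 >/dev/null
tree_am=$(git rev-parse 'HEAD^{tree}')

echo "cherry-pick tree: $tree_cherry_pick"
echo "am -3 tree:       $tree_am"
```

To get numbers comparable to the ones above, wrap each of the two replay
steps in /usr/bin/time and scale the loop bounds up; a tree this small
is unlikely to show a measurable difference.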

I _think_ most of the overhead comes from having to match the large trees
in unpack_trees() even though none of the changes between the base
versions matters for this "cherry-pick".

Both read the flat index into the core in its entirety, and futzing with
the index file format would not affect this comparison, even though it
could improve the performance of "am", if done right, as it could limit
its updates to only two paths.  In the merge case, we pretty much rebuild
the resulting index from scratch by walking the entire tree in
unpack_trees(), so there won't be much benefit.

Perhaps we might want to rethink the way we run merges?