Re: [PATCH v6 0/3] read-cache: speed up add_index_entry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 4/7/2017 12:46 AM, Jeff King wrote:
On Thu, Apr 06, 2017 at 04:34:39PM +0000, git@xxxxxxxxxxxxxxxxx wrote:

Teach add_index_entry_with_check() and has_dir_name()
to avoid index lookups if the given path sorts after
the last entry in the index.

This saves at least 2 binary searches per entry.

This improves performance during checkout and read-tree because
merge_working_tree() and unpack_trees() processes a list of already
sorted entries.

Just thinking about this algorithmically for a moment. You're saving the
binary search when the input is given in sorted order. But in other
cases you're adding an extra strcmp() before the binary search begins.
So it's a tradeoff.

How often is the input sorted?  You save O(log n) strcmps for a "hit"
with your patch, and one for a "miss". So it's a net win if we expect at
least 1/log(n) of additions to be sorted (I'm talking about individual
calls, but it should scale linearly either way over a set of n calls).

I have no clue if that's a reasonable assumption or not.

I was seeing checkout call merge_working_tree to iterate over the
source index/trees and call add_index_entry() for each.  For example,
in a "checkout -b" like operation where both sides are the same, this
calls keep_entry() which appends the entry to the new index array.
The append path should always be taken because the iteration is being
driven from a sorted list.

I would think calls to add/stage individual files arrive in random
order, so I'm not suggesting replacing the code -- just checking the
end first.

Jeff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]