Re: git rm VERY slow for directories with many files.

Junio C Hamano <gitster@xxxxxxxxx> writes:

> I wonder how fast "git diff-index --cached -r HEAD --", with the
> same pathspec used for the problematic "git rm", runs in this same
> 50,000 path project.  
>
> If it runs in a reasonable time, one easy way out may be to revamp
> the codepath to call check_local_mod() to:
>
>  - first before making the call, do the "diff-index --cached" thing
>    internally with the same pathspec to grab the list of paths that
>    have local modifications; save the set of paths in a hashmap or
>    something.
>
>  - pass that hashmap to check_local_mod(), and where the function
>    does the "staged_changes" check, consult the hashmap to see the
>    path in question is different between the HEAD and the index.
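
To illustrate the idea quoted above, here is a rough sketch, not
actual Git code.  The names (changed_set and friends) are made up,
and a sorted array with bsearch() stands in for the "hashmap or
something"; the list of paths is assumed to have been collected
already by the internal "diff-index --cached" step.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for paths that differ between HEAD and the index. */
struct changed_set {
	const char **paths;
	size_t nr;
};

/* Compare two array elements, each of which is a pointer to a path string. */
static int cmp_path(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort once, up front, so that every later query is a binary search. */
static void changed_set_prepare(struct changed_set *s)
{
	qsort(s->paths, s->nr, sizeof(*s->paths), cmp_path);
}

/* Return 1 if "path" was recorded as changed, 0 otherwise. */
static int changed_set_contains(const struct changed_set *s, const char *path)
{
	return bsearch(&path, s->paths, s->nr, sizeof(*s->paths),
		       cmp_path) != NULL;
}
```

With something like this, the "staged_changes" check would become a
single changed_set_contains() call per path instead of a fresh tree
lookup.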

And if we want to try a more localized band-aid, another approach
may be to add a caching version of get_tree_entry() that keeps
track of the (stack of) trees, the path component we found during
the last call to the helper, and the tree_desc.  That way, when we
get the next call, we descend that stack as long as the leading path
components are still the same.  When we see that the path component
we are looking for is different from what we used in the last call,
we either (1) reuse the tree_desc and keep going forward, if the
name we looked for last time sorts before the one we are looking for
now, or (2) discard and reopen the tree, rewinding the tree_desc to
the beginning, and do the scan.

That way, the caller of check_local_mod() does not have to know
the trick, and because the loop in check_local_mod() iterates over
the list that is already sorted in the index order, we'd not just
reduce the number of times we open the trees but also reduce the
number of times we scan and skip the entries in trees to find the
entries we are after.
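
A toy sketch of the "keep going forward or rewind" part, limited to
a single tree level, might look like the following.  Nothing here is
existing Git API: toy_tree, toy_cursor and toy_lookup() are invented
for illustration, a sorted array of names stands in for the
tree_desc, and plain strcmp() order is a simplification of the real
tree entry order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one opened tree: entry names, already sorted. */
struct toy_tree {
	const char **names;
	size_t nr;
};

/* Hypothetical cursor that remembers where the previous lookup stopped. */
struct toy_cursor {
	const struct toy_tree *tree;
	size_t pos;        /* next entry to examine */
	const char *last;  /* name asked for last time; caller keeps it alive */
	size_t steps;      /* entries examined so far, for illustration only */
	size_t rewinds;    /* how many times we had to start over */
};

static void toy_cursor_init(struct toy_cursor *c, const struct toy_tree *t)
{
	c->tree = t;
	c->pos = 0;
	c->last = NULL;
	c->steps = 0;
	c->rewinds = 0;
}

/* Return the index of "name" in the tree, or -1 if it is not there. */
static long toy_lookup(struct toy_cursor *c, const char *name)
{
	/* Case (2): the request sorts before the previous one, so rewind. */
	if (c->last && strcmp(name, c->last) < 0) {
		c->pos = 0;
		c->rewinds++;
	}
	c->last = name;

	/* Case (1): otherwise resume scanning from where we stopped. */
	while (c->pos < c->tree->nr) {
		int cmp = strcmp(c->tree->names[c->pos], name);
		c->steps++;
		if (!cmp)
			return (long)c->pos; /* stay here for a repeated request */
		if (cmp > 0)
			return -1; /* went past the slot where it would be */
		c->pos++;
	}
	return -1;
}
```

When the requests arrive in sorted order, each entry is examined at
most a couple of times in total, instead of the scan restarting from
the top of the tree for every path.  A real version would also need
the stack of trees for the leading path components.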


