"brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes: >> Looking at an optimized profile, all the time seems to be spent in “get_tree_entry” — I assume there is some huge object representing the directory which is being re-expanded for each file? > > Yes, there's a tree object that represents each directory. > >> Is there any way I can speed up removing this directory? > > First, make sure your working directory is clean with no changes. Then, > remove the directory (by hand) or move it somewhere else. Then, run > "git add -u". > > That should allow you to commit the removal of those files quickly. If get_tree_entry() shows up a lot in the profile, it would indicate that a lot of cycles are spent in check_local_mod(). Bypassing it with "-f" may be the first thing to try ;-) The way "git rm" makes repeated calls to get_tree_entry() with deep pathnames would be an easy recipe to get quadratic behaviour like the one reported in the first message on this thread, as it always goes from the root level, grabs an tree object and scans it to get the entry for the next level, and (worse yet) a look-up of a path component in each of these tree object must be done as a linear scan. I wonder how fast "git diff-index --cached -r HEAD --", with the same pathspec used for the problematic "git rm", runs in this same 50,000 path project. If it runs in a reasonable time, one easy way out may be to revamp the codepath to call check_local_mod() to: - first before making the call, do the "diff-index --cached" thing internally with the same pathspec to grab the list of paths that have local modifications; save the set of paths in a hashmap or something. - pass that hashmap to check_local_mod(), and where the function does the "staged_changes" check, consult the hashmap to see the path in question is different between the HEAD and the index.