[PATCH 0/3] Improving performance of git clean

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This series addresses a performance issue of git clean previously
discussed here:
http://thread.gmane.org/gmane.comp.version-control.git/265560/focus=265560
and here:
http://thread.gmane.org/gmane.comp.version-control.git/266777/focus=266777

The issue manifests when trying to clean a large number of untracked
directories. In my case this scenario triggered by a test suite
running in the repository creating a directory for each test resulting
in a build directory with ~100000 sub directories that needs to be
cleaned. For some extreme cases, clean times of more than 1h have been
observed.

With this series, the time to clean an untracked directory containing
100000 sub directories goes from 61s to 1.7s.

The main change is to switch the repository check in
clean.c:remove_dirs from using refs.c:resolve_gitlink_ref to
setup.c:is_git_directory.

One potential issue that is_git_directory contains the following check:

	if (getenv(DB_ENVIRONMENT)) {
		if (access(getenv(DB_ENVIRONMENT), X_OK))
			return 0;
	}

I'm not sure how this will affect this usecase (checking for some
other nested git repo). Please give some thought to this when
reviewing.

Jeff King also expressed concerns that we may have similar performance
issues in other commands and that it could be good to unify these "is
this a repo?"-checks. This series only attempts to solve the git-clean
case.

Erik Elfström (3):
  t7300: add tests to document behavior of clean and nested git
  p7300: added performance tests for clean
  clean: improve performance when removing lots of directories

 builtin/clean.c       | 23 ++++++++++++---
 t/perf/p7300-clean.sh | 37 +++++++++++++++++++++++
 t/t7300-clean.sh      | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 138 insertions(+), 4 deletions(-)
 create mode 100755 t/perf/p7300-clean.sh

-- 
2.4.0.rc0.37.ga3b75b3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]