Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 17 Mar 2015 01:48:00 +0000, Jeff King wrote:
> On Mon, Mar 16, 2015 at 10:35:18PM -0700, Junio C Hamano wrote:
> 
> > > It looks like we don't even really care about the value of HEAD. We just
> > > want to know "is it a git directory?". I think in other places (like
> > > "git add"), we just do an existence check for "$dir/.git". That would
> > > not catch a bare repository, but I do not think the current check does
> > > either (it is looking for submodules, which always have a .git).
> > 
> > If we wanted to be consistent, perhaps we should be reusing the "is
> > this a git repository?" check used by the auto-discovery codepath
> > (setup.c:is_git_directory(), perhaps?), but the idea looks simple
> > enough and sounds sensible.
> 
> Yeah, I almost suggested that, but I'm concerned that would make us
> inconsistent with how we report untracked files. I thought that dir.c
> used ".git" as a magic token there.
> 
> But it seems I'm wrong. We do ignore ".git" directly in treat_path(),
> but treat_directory actually checks resolve_gitlink_ref. I think this
> will suffer the same problem as Andreas's original issue (e.g., if you
> run "git ls-files -o").

Guess what landed on my desk this week. Same repo, same
application test suite, same problem, now with 'git ls-files -o'.

> Likewise, I think dir.c:remove_dir_recurse is in a similar boat.
> Grepping for resolve_gitlink_ref, it looks like there may be others,
> too.

Can't we handle this in resolve_gitlink_ref itself? As I understand it,
it should resolve a ref (here "HEAD") when path points to a submodule.
When there isn't one it should return -1, so:

diff --git a/refs.c b/refs.c
index 132eff5..f8648c5 100644
--- a/refs.c
+++ b/refs.c
@@ -1553,6 +1553,10 @@ int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sh
 	if (!len)
 		return -1;
 	submodule = xstrndup(path, len);
+	if (!is_git_directory(submodule)) {
+		free(submodule);
+		return -1;
+	}
 	refs = get_ref_cache(submodule);
 	free(submodule);

I'm way too little into the code to see what may this may get wrong.

But this, as well as the old hash-ref-cache patch speeds me
up considerably, in this case a git ls-files -o from half a
minute of mostly user CPU to a second.

> All of these should be using the same test, I think. Doing that with
> is_git_directory() is probably OK. It is a little more expensive than we
> might want for mass-use (it actually opens and parses the HEAD file in
> each directory),

This happens as well when we let resolve_gitlink_ref run its old course.
(It (ls-files) even seems to try to open .git and then .git/HEAD, even
if the former fails with ENOENT.)

Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]