Eventually I found some time to investigate this issue ...
On Nov 22, 2007, at 6:56 PM, Steffen Prohaska wrote:
On Nov 22, 2007, at 10:58 AM, Johannes Sixt wrote:
Steffen Prohaska schrieb:
On Nov 22, 2007, at 8:52 AM, Junio C Hamano wrote:
I suspect that if there are even earlier callers than these
early parts in the codepaths (handle_options, enter_repo, and
setup_git_directory_gently), maybe these earlier callers are
doing something wrong. Logically, if you are somewhere very
early in the codepath that you can still change the value of
GIT_DIR, you shouldn't have assumed the unknown value of GIT_DIR
and cached locations relative to that directory, no? What are
the problematic callers? What values do they access and why?
I thought about these questions, too. But only very briefly.
I did not analyze the code path that lead to calls of getenv().
I'm not sure if it's really necessary. Calling set_git_dir()
looks more sensible too me than the old code. I believe using
set_git_dir() is the safer choice, and should not do any harm.
So I stopped analyzing too much, and instead proposed to use
set_git_dir().
Junio's point is this: If we stumble over a dangling pointer that
getenv() produced, then this has obviously happened before setenv
(GIT_DIR), and caching that pointer is probably the wrong thing to
do anyway (because it refers to the wrong GIT_DIR) and needs to be
fixed.
I see your point. It is probably more important to investigate
this than I recognized at a first glance.
I instrumented the code to verify if setenv(GIT_DIR) is called after
setup_git_env(). This is not the case for all tests.
I also searched for problematic code paths.
setup_git_directory_gently() looks correct. It explicitly calls
getenv(GIT_DIR_ENVIRONMENT); but uses the value returned in a
safe manner. It does not cache the result and the only code path
that calls set_git_dir() does not access the return value of the
getenv() call after the call to set_git_dir().
setup_work_tree() looks correct, too. Here, get_git_dir() is
called, which implicitly results in caching the pointer returned
from getenv(GIT_DIR_ENVIRONMENT). But the result of get_git_dir() is
neither cached nor used after a subsequent call to set_git_dir().
So, I don't find any obvious problems.
Interesting, though, is to find out if we have other potentially
dangerous calls to getenv() that are not removed by this patch.
Side note for other readers: This is a Windows specific problem
for the moment because its getenv() does not behave well.
Yes, and apparently even nobody knows how to trigger the problem
on Windows. At this point, we only know that caching getenv()
calls is unsafe, while on Unix it is safe (at least for BSD
it's documented to be safe).
In conclusion, using setenv() as in the original code instead of
set_git_dir() should be safe and this patch is not needed.
I tend to revert the changes in msysgit and see if we hit any
problems. But I'll wait until 1.5.4 is released.
Steffen
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html