On 5/10/2018 3:00 AM, Junio C Hamano wrote:
Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:
We use the lockfile API to prevent multiple Git processes from writing to
the commit-graph file in the .git/objects/info directory. In some cases,
this directory may not exist, so we check for its existence.
The existing code does the following when acquiring the lock:
1. Try to acquire the lock.
2. If it fails, try to create the .git/objects/info directory.
3. Try to acquire the lock, failing if necessary.
The problem is that if the lockfile exists, then the mkdir fails, giving
an error that doesn't help the user:
"fatal: cannot mkdir .git/objects/info: File exists"
Isn't a better immediate fix to make the second step pay attention
to errno? If mkdir() failed due to EEXIST, then we know we tried to
acquire the lock already, so we can die with an appropriate message.
That way, we can keep the "optimize for the normal case" approach of
assuming the objects/info/ directory is already there, instead of
always checking beforehand for its existence, which is almost always
true.
This "optimize for the normal case" is why the existing code is
organized the way it is.
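
A minimal sketch of the "pay attention to errno" idea, using only
POSIX mkdir(). The enum and the function name are hypothetical and
are not taken from Git's source; a real caller would die() with a
lock-related message in the EEXIST case rather than report
"cannot mkdir".

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical outcome codes for the mkdir() attempted after a failed lock. */
enum mkdir_outcome {
	DIR_CREATED,       /* directory was missing; retry the lock */
	DIR_ALREADY_THERE, /* EEXIST: the lock failure had another cause */
	DIR_MKDIR_FAILED   /* any other mkdir() error */
};

/* Classify the mkdir() result by errno instead of treating every failure alike. */
static enum mkdir_outcome try_mkdir_after_lock_failure(const char *dir)
{
	if (!mkdir(dir, 0777))
		return DIR_CREATED;
	if (errno == EEXIST)
		return DIR_ALREADY_THERE;
	return DIR_MKDIR_FAILED;
}
```

With this split, DIR_ALREADY_THERE is the case that currently yields
the confusing "File exists" message.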
Since this code is only for writing a commit-graph file, this "check the
directory first" option is a very small portion of the full time to
write the file, so the "optimization" has very little effect,
relatively. My personal opinion is to make the code cleaner when the
performance difference is negligible.
I'm willing to concede this point and use the steps you suggest below,
if we think this is the best way forward.
Also, can't we tell why we failed to acquire the lock at step #1?
Do we only get a NULL that says "I am not telling you why, but we
failed to lock"?
To tell why we failed to acquire the lock, we could inspect "errno".
However, this requires whitebox knowledge of both the lockfile API and
the tempfile API to know that the last system call to set errno was an
open() or adjust_shared_perm(). To cleanly make decisions based on why
the lock acquisition failed, I think we would need to modify the
lockfile and tempfile APIs to return a failure reason. This could be
done by passing an `int *reason`, but the extra noise in these APIs is
likely not worth the change.
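
For illustration only, here is roughly what an `int *reason`
out-parameter could look like. This is a standalone sketch built on
POSIX open() with O_CREAT | O_EXCL; try_lock_with_reason() is a
hypothetical name and is not part of the lockfile or tempfile APIs.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not Git's API. Returns an open file descriptor
 * on success. On failure returns -1 and, if reason is non-NULL, stores
 * the errno left by open() in *reason.
 */
static int try_lock_with_reason(const char *lock_path, int *reason)
{
	int fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);

	if (fd < 0 && reason)
		*reason = errno;
	return fd;
}
```

A caller could then tell ENOENT (a leading directory is missing) from
EEXIST (the lock file is already there) without knowing which system
call ran last inside the helper.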
What I am getting at is that the ideal sequence
would be more like:
1. Try to acquire the lock.
2-a. if #1 succeeds, we are happy. ignore the rest and return
the lock.
2-b. if #1 failed because objects/info/ did not exist,
mkdir() it, and die if we cannot, saying "cannot mkdir".
if mkdir() succeeds, jump to 3.
2-c. if #1 failed but that is not due to missing objects/info/,
die saying "cannot lock".
3. Try to acquire the lock.
4-a. if #3 succeeds, we are happy. ignore the rest and return
the lock.
4-b. die saying "cannot lock".
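
The sequence above could be sketched as follows. This is a standalone
approximation, not the commit-graph code: take_lock() is a hypothetical
stand-in that creates the lock file exclusively with open(), a missing
directory is assumed to show up as ENOENT, and the function hands back
an error string where the real code would die().

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for taking a lock: exclusive create of lock_path. */
static int take_lock(const char *lock_path)
{
	return open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);
}

/*
 * Returns the lock's file descriptor, or -1 with *err set to
 * "cannot mkdir" or "cannot lock".
 */
static int lock_creating_dir(const char *dir, const char *lock_path,
			     const char **err)
{
	int fd = take_lock(lock_path);  /* step 1 */

	if (fd >= 0)
		return fd;              /* 2-a */
	if (errno != ENOENT) {
		*err = "cannot lock";   /* 2-c: failure unrelated to the directory */
		return -1;
	}
	if (mkdir(dir, 0777) < 0) {
		*err = "cannot mkdir";  /* 2-b: directory missing, mkdir failed */
		return -1;
	}
	fd = take_lock(lock_path);      /* step 3 */
	if (fd < 0)
		*err = "cannot lock";   /* 4-b */
	return fd;                      /* 4-a on success */
}
```

Because mkdir() is attempted only after an ENOENT, an existing lock
file leads to "cannot lock" and never to "cannot mkdir ...: File
exists".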
Thanks,
-Stolee