On 04/24/2015 01:35 PM, Michael Haggerty wrote:
> The old code locked all of the references in the first loop, leaving
> all of the lockfiles open until later loops. Since we might be
> updating a lot of references, this could cause file descriptor
> exhaustion.
>
> But there is no reason to keep the lockfiles open after the first
> loop:
>
> * For references being deleted or verified, we don't have to write
>   anything into the lockfile, so we can close it immediately.
>
> * For references being updated, we can write the new SHA-1 into the
>   lockfile immediately and close the lockfile without renaming it
>   into place. In the second loop, the already-written contents can
>   be renamed into place using commit_ref_update().
>
> To simplify the bookkeeping across loops, add a new REF_NEEDS_COMMIT
> bit to update->flags, which keeps track of whether the corresponding
> lockfile needs to be committed, as opposed to just unlocked.
>
> This change fixes two tests in t1400.

I realize that I should have highlighted another, perhaps more serious
bug that is fixed by this change. The old code was roughly

    for update in updates:
        acquire locks and check old_sha
    for update in updates:
        if changing value:
            write_ref_to_lockfile()
            commit_ref_update()
    for update in updates:
        if deleting value:
            unlink()
    for update in updates:
        if reference still locked:
            unlock_ref()

The atomicity of the update depends on all preconditions being checked
in the first loop, before any changes have started being committed. The
problem is that write_ref_to_lockfile() (which used to be part of
write_ref_sha1()), and which happens in the second loop, checks some
more preconditions:

* It verifies that new_sha1 is a valid object.

* If the reference being updated is a branch, it verifies that
  new_sha1 points at a commit object (as opposed to a tag, tree, or
  blob).

If either of these preconditions fails, the "transaction" might be
aborted after some reference updates have already been permanently
committed.
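
To make the failure mode concrete, here is a minimal Python sketch
(hypothetical illustration only, not git's actual C code) contrasting a
commit loop that checks preconditions late, as the old code did, with
one that checks every update before committing anything:

    # Sketch: why precondition checks must precede all commits.

    class TransactionAborted(Exception):
        pass

    def commit_buggy(updates, check, commit):
        """Old ordering: check each update just before committing it."""
        for u in updates:
            if not check(u):                 # late check fails here ...
                raise TransactionAborted(u)  # ... but earlier updates are
            commit(u)                        # already permanently committed

    def commit_fixed(updates, check, commit):
        """New ordering: check every update first, then commit them all."""
        for u in updates:
            if not check(u):
                raise TransactionAborted(u)  # nothing committed yet
        for u in updates:
            commit(u)

With updates ["a", "bad", "c"] and a check that rejects "bad",
commit_buggy() leaves "a" committed when it aborts, while
commit_fixed() aborts with nothing committed, preserving the
all-or-nothing guarantee.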
In other words, the all-or-nothing promise of "git update-ref --stdin"
and "git push --atomic" would be violated. I'm sorry that I didn't
fully realize this problem earlier.

After this patch, the code looks like

    for update in updates:
        acquire locks and check old_sha
        if changing value:
            write_ref_to_lockfile()
    for update in updates:
        if changing value:
            commit_ref_update()
    for update in updates:
        if deleting value:
            unlink()
    for update in updates:
        if reference still locked:
            unlock_ref()

This has the important effect that the pre-checks in
write_ref_to_lockfile() are carried out before any changes have been
committed, so if any of those checks fail, the transaction can be
rolled back correctly.

Given that "git push --atomic" is one of the main new features of
2.4.0, it would be unfortunate for the release to contain this bug,
plus the bug that atomic pushes can fail due to file descriptor
exhaustion. Is this problem important enough to justify delaying the
release to get the fix in?

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx