On 02/18/2016 01:29 AM, David Turner wrote: > On Fri, 201-02-12 at 15:09 +0100, Michael Haggerty wrote:] >> On 02/05/2016 08:44 PM, David Turner wrote: >>> Before committing ref updates, split symbolic ref updates into two >>> parts: an update to the underlying ref, and a log-only update to >>> the >>> symbolic ref. This ensures that both references are locked >>> correctly >>> while their reflogs are updated. >>> >>> It is still possible to confuse git by concurrent updates, since >>> the >>> splitting of symbolic refs does not happen under lock. So a >>> symbolic ref >>> could be replaced by a plain ref in the middle of this operation, >>> which >>> would lead to reflog discontinuities and missed old-ref checks. >> >> This patch is doing too much at once for my little brain to follow. >> >> My first hangup is the change to setting RESOLVE_REF_NO_RECURSE >> unconditionally in lock_ref_sha1_basic(). I count five callers of >> that >> function and see no justification for why the change is OK in the >> context of each caller. Here are some thoughts: >> >> * The call from files_create_symref() sets REF_NODEREF, so it is >> unaffected by this change. > > Yes. > >> * The call from files_transaction_commit() is preceded by a call to >> dereference_symrefs(), which I assume effectively replaces the need >> for >> RESOLVE_REF_NO_RECURSE. > > Yes. > >> * There are two calls from files_rename_ref(). Why is it OK to do >> without RESOLVE_REF_NO_RECURSE there? >> >> * For the oldrefname call, I suppose the justification is the >> "(flag & >> REF_ISSYMREF)" check earlier in the function. (But does this >> introduce a >> significant TOCTOU race?) > > The refs code as a whole seems likely to have TOCTOU issues. In > general, anywhere we check/set flag & REF_ISSYMREF without holding a > lock, we have a potential problem. I haven't generally tried to handle > these cases, since they're not presently handled. I agree that we don't do so well here, though I think that most races would result in reading/writing a ref that was pointed to by the symref a moment ago, which is usually indistinguishable to the user from their update having gone through the moment before the symref was updated. So I don't think your change makes this bit of code significantly worse. > The central problem with this area of the code is that commit interacts > so intimately with the locking machinery. I understand some of why > it's done that way. In particular, your change to ref locking to not > hold lots of open files was a big win for us at Twitter. But this > means that it's hard to deal with cross-backend ref updates: you want > to hold multiple locks, and backends don't have the machinery for it. > > We could add backend hooks to specifically lock and unlock refs. Then > the backend commit code would just be handled a bundle of locked refs > and would commit them. This might be hairy, but it could fix the > TOCTOU problems. So, first lock the outer refs, then split out updates > for any which are symbolic refs, and lock those. Finally, commit all > updates (split by backend). As chance would have it, for an internal GitHub project I've implemented hooks that can be called *during* a ref transaction. The hooks can, for example, take arbitrary actions between the time that the reflocks are all acquired and the time that the updates start to be committed. I didn't submit this code upstream because I didn't think that it would benefit other users, but many it would be useful for implementing split-backend reference transaction commits. E.g., the primary reference transaction could run the secondary backend's commit while holding the locks for the primary backend references. Let me think about it. I don't think this is urgent though. The current code is not significantly racy in mainstream usage scenarios, right? > One downside of this is that right now, the backend API is relatively > close to the front-end, and this would leak what should be an > implementation detail. But maybe this is necessary to knit multiple > backends together. > > But I'm not sure that this is necessary right now, because I'm not sure > that I'm actually making TOCTOU issues much worse. Agreed. > [...] > That's a legit complaint. The problem, as you note, is that doing some > of these steps completely independently doesn't work. But I'll try > splitting out what I can. Thanks! Michael -- Michael Haggerty mhagger@xxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html