Thanks for the very detailed answer.

On Fri, Jan 11, 2013 at 10:12 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Jan 11, 2013 at 04:40:38PM +0530, Sitaram Chamarty wrote:
>
>> I find a lot of info on how to recover from and/or repair a repo that
>> has missing (or corrupted) objects.
>>
>> What I need is info on common reasons (other than disk errors -- we've
>> checked for those) for such errors to occur, any preventive measures
>> we can take, and so on.
>
> I don't think any race can cause corruption of the object or packfiles
> because of the way they are written. At GitHub, every case of
> file-level corruption we've seen has been a filesystem issue.
>
> So I think the main systemic/race issue to worry about is missing
> objects. And since git only deletes objects during a prune (assuming
> you are using git-gc or "repack -A" so that repack cannot drop
> objects), I think prune is the only thing to watch out for.

No one runs anything manually under normal conditions. If there's any
gc happening, it's "gc --auto".

> The --expire time saves us from the obvious races where you write
> object X but have not yet referenced it, and a simultaneous prune
> wants to delete it. However, it's possible that you have an old object
> that is unreferenced, but would become referenced as a result of an
> in-progress operation. For example, commit X is unreferenced and ready
> to be pruned, you build commit Y on top of it, but before you write
> the ref, git-prune removes X.
>
> The server-side version of that would happen via receive-pack, and is
> even more unlikely, because X would have to be referenced initially
> for us to advertise it. So it's something like:
>
>   1. The repo has a ref R pointing at commit X.
>
>   2. A user starts a push to another ref, Q, of commit Y that builds
>      on X. Git advertises ref R, so the sender knows they do not need
>      to send X, but only Y. The user then proceeds to send the
>      packfile (which might take a very long time).
>
>   3. Meanwhile, another user deletes ref R. X becomes unreferenced.

The gitolite logs show that no deletion of refs has happened.

>   4. After step 3 but before step 2 has finished, somebody runs prune
>      (this might sound unlikely, but if you kick off a "gc" job after
>      each push, or after N pushes, it's not so unlikely). It sees that
>      X is unreferenced, and it may very well be older than the
>      --expire setting. Prune deletes X.
>
>   5. The packfile in (2) arrives, and receive-pack attempts to update
>      the refs.
>
> So it's even a bit more unlikely than the local case, because
> receive-pack would not otherwise build on dangling objects. You have
> to race steps (2) and (3) just to create the situation.
>
> Then we have an extra protection in the form of
> check_everything_connected, which receive-pack runs before writing the
> refs into place. So if step 4 happens while the packfile is being sent
> (which is the most likely case, since it is the longest stretch of
> receive-pack's time), we would still catch it there and reject the
> push (annoying to the user, but the repo remains consistent).
>
> However, that's not foolproof. We might hit step 4 after we've checked
> that everything is connected but right before we write the ref. In
> which case we drop X, which has just become referenced, and we have a
> missing object.
>
> So I think it's possible. But I have never actually seen it in
> practice, and come up with this scenario only by brainstorming "what
> could go wrong" scenarios.
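Just so I'm sure I understand the --expire protection in that scenario,
these are the knobs I believe are in play, with what I understand to be
git's defaults. (A sketch for my own notes; I haven't verified these
values on our servers.)

    # loose-object threshold before "gc --auto" does anything; default 6700
    git config gc.auto

    # grace period before unreachable objects are pruned; default 2.weeks.ago
    git config gc.pruneExpire

    # the "safe" repack mentioned above: -A keeps unreachable objects as
    # loose objects instead of dropping them, and an explicit prune still
    # respects the grace period
    git repack -A -d
    git prune --expire=2.weeks.ago
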
> This could be mitigated if there was a "proposed refs" storage.
> Receive-pack would write a note saying "consider Y for pruning
> purposes, but it's not really referenced yet", check connectivity for
> Y against the current refs, and then eventually write Y to its real
> ref (or reject it if there are problems). Prune would either run
> before the "proposed" note is written, which would mean it deletes X,
> but the connectivity check fails. Or it would run after, in which case
> it would leave X alone.
>
>> For example, can *any* type of network error or race condition cause
>> this? (Say, one push writes an object, then fails an update check,
>> and a later push succeeds and races against a gc that removes the
>> unreachable object?) Or... the repo is pretty large -- about 6-7 GB,
>> so could size cause a race that would not show up on a smaller repo?
>
> The above is the only open issue I know about. I don't think it is
> dependent on repo size, but the window is widened for a really large
> push, because rev-list takes longer to run. It does not widen if you
> have receive.fsckobjects set, because that happens before we do the
> connectivity check (and the connectivity check is run in a
> sub-process, so the race timer starts when we exec rev-list, which may
> open and mmap packfiles or otherwise cache the presence of X in
> memory).
>
>> Anything else I can watch out for or caution the team about?
>
> That's the only open issue I know about for missing objects.
>
> There is a race with simultaneously deleting and packing refs. It
> doesn't cause object db corruption, but it will cause refs to "rewind"
> back to their packed versions. I have seen that one in practice
> (though relatively rare). I fixed it in b3f1280, which is not yet in
> any released version.

This is for the packed-refs file, right? And it could result in a ref
getting deleted, right?

I said above that the gitolite logs say no ref was deleted. What if the
ref "deletion" happened because of this race, making the rest of your
5-step scenario above possible?

>> The symptom is usually a disk space crunch caused by tmp_pack_* files
>> left around by auto-gc. Presumably the missing objects failed the gc
>> and so it left the files around, and they eventually accumulate
>> enough to cause disk full errors. (If a gc ever succeeded, I suspect
>> these files would go away, but that requires manual intervention).
>
> Gc shouldn't be leaving around tmp_pack_* unless it is actually dying
> during the pack-objects phase. In my experience, stale tmp_pack_*
> objects are more likely a sign of a failed push (e.g., the client hangs

Oh. I did not know that! That explains why I sometimes saw them even
when there were fewer than 6700 loose objects (i.e., auto-gc should not
have kicked in at all).

> up in the middle, or fsck rejects the pack). We have historically left

fsck... you mean if I had 'receive.fsckObjects' set to true, right? I
don't. Should I? Would it help this overall situation? As I understand
it, that's only about the internals of each object, to check for
corruption; it cannot detect a *missing* object in the local object
store.

> them in place to facilitate analysis of the failure.
>
> At GitHub, we've taken to just cleaning them up aggressively (I think
> after an hour), though I am tempted to put in an optional signal/atexit

OK; I'll do the same then. I suppose a cron job is the best way; I
didn't find any config for expiring these files.
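In case it's useful to anyone else hitting this, here's the sort of
thing I have in mind. An untested sketch: the repository root
(/srv/git) and the one-hour threshold are only placeholders for
whatever we settle on.

    # e.g. dropped into /etc/cron.hourly/git-clean-tmp-packs
    # delete stale tmp_pack_* files that are more than an hour old
    find /srv/git -type f -path '*/objects/pack/tmp_pack_*' -mmin +60 -delete

    # and the fsck-on-receive knob from above, if we decide to turn it on
    git config --system receive.fsckObjects true
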
Thanks again for your help.

I'm going to treat it (for now) as a disk/fs error, pending your
thoughts on the other possibility I mentioned above, although I find it
hard to believe one repo can be hit by *two* races occurring together!

> handler to clean them up when index-pack fails. The forensics are
> occasionally interesting (e.g., finding weird fsck problems), but
> large pushes can waste a lot of disk space in the interim.
>
> -Peff

--
Sitaram