On Tue, 2009-01-06 at 20:25 -0500, Nicolas Pitre wrote:
> On Tue, 6 Jan 2009, R. Tyler Ballance wrote:
>
> > On Tue, 2008-12-09 at 09:36 +0100, Jan Krüger wrote:
> > > For fixing a corrupted repository by using backup copies of individual
> > > files, allow write_sha1_file() to write loose files even if the object
> > > already exists in a pack file, but only if the existing entry is marked
> > > as corrupted.
> >
> > I figured I'd reply to this again, since the issue cropped up again.
> >
> > We started experiencing *large* numbers of corruptions like the ones
> > that started the thread (one developer was receiving them once or twice
> > a day) with v1.6.0.4
> >
> > We went ahead and upgraded to a custom build of v1.6.1 with Jan's patch
> > (below) and the issues /seem/ to have resolved themselves. I'm not
> > certain whether Jan's patch was really responsible, or if there was
> > another issue that caused this to correct itself in v1.6.1.

I'll back the patch out and redeploy, it's worth mentioning that a
coworker of mine just got the issue as well (on 1.6.1). He was able to
`git pull` and the error went away, but I doubt that it "magically fixed
itself"

> Please back it out. As it stands, that patch is a no op because of the
> way git is used, and even if the patch was to work as intended, its
> purpose is not to magically fix corruptions without special action from
> your part. If you have corruption problems coming back only because of
> the removal of this patch then something is really really fishy and I
> would really like to know about it.
>
> There were indeed many changes between v1.6.0.4 and v1.6.1: the exact
> number is 1029. A couple of them are especially addressing increased
> robustness against some kind of pack corruptions. But in any case you
> still should see error messages appearing about them.
>
> And don't underestimate the power of disk corruptions. I started to
> work on git corruption resilience simply because I ended up with a
> corrupted pack at some point. Then a while later I got another
> corrupted pack. Then another while later I lost my filesystem entirely
> and had to reinstall my system (after buying a new disk). Turns out
> that my old disk is silently corrupting data without signaling any
> errors to the host.

I highly doubt this, I've got the issue appearing on at least 7
different development boxes (not workstations, 2U quad-core ECC RAM, etc
machines), while that doesn't mean that they all don't have issues, the
probability of them *all* having disk issues, and it somehow only
manifesting itself with Git usage, is low ;)

I've tarred one of the repositories that had it in a reproducible state
so I can create a build and extract the tar and run against that to
verify any patches anybody might have, but unfortunately at 7GB of
company code and assets, I can't exactly share ;)

Cheers
--
-R. Tyler Ballance
Slide, Inc.