On Thu, Jan 18, 2024 at 09:02:30AM +0100, R. Diez wrote:
> Hi all:
>
> I have been hit by an unfortunate system problem, and as a result, a
> few files in my Git repository got corrupted on my last git push. Some
> random blocks of bytes were overwritten with binary zeros, so I
> started getting weird unpacking errors etc.
>
> It took a while to realise what the problem was. During my
> investigation, I ran "git fsck", which reported no problems, and then
> "git push" failed.
>
> One of the very few corrupted files was packed-refs. This is a text
> file, so it was easy to compare it and see the corrupting binary
> zeros. But that made me wonder what "git fsck" checks.

Can you maybe expand a bit on how you arrived at this bug? Was this a
hard crash of the system that corrupted the repository or rather
something like actual disk corruption?

I'm mostly asking because I have been fixing some sources of refdb
corruption:

  - bc22d845c4 (core.fsync: new option to harden references,
    2022-03-11) started to fsync loose refs to disk before renaming
    them into place, released with Git v2.36.

  - ce54672f9b (refs: fix corruption by not correctly syncing
    packed-refs to disk, 2022-12-20) started to sync packed-refs to
    disk before renaming them into place, released with Git v2.40 and
    backported to Git v2.39.3.

So if:

  - you use a journaling filesystem,

  - you didn't disable `core.fsync`,

  - you use Git v2.40 or newer,

then you should in theory not run into any refdb corruption anymore. At
least we didn't experience corruption anymore at GitLab.com, whereas
before we encountered corruption every so often.

> I am guessing that "git fsck" does not check file packed-refs at all.
> I mean, it does not even attempt to parse it, in order to check
> whether at least the format makes any sense. Only "git push" does it.

Indeed it doesn't.
While the issue is comparatively easy to spot by manually inspecting
the `packed-refs` file, I agree that it would be great if git-fsck(1)
knew how to check the refdb for consistency. This problem is only going
to get worse once the upcoming reftable backend lands -- it is a binary
format, and just opening it with a text editor to check whether it
looks sane-ish stops being a viable option here.

In fact, I already planned to introduce such consistency checks for the
refdb soonish. Once the reftable backend is upstream I will focus more
on additional tooling to support it, and extending our consistency
checks is one of the first items on my todo list here.

> What other parts of the repository does "git fsck" not check then?

There may be some metadata and cache-like data structures that we don't
check, but the object database is checked by default.

> The repository check is suspiciously fast. Is there a slow way to
> check that a repository is fine? I mean, something along the lines of
> checking whether every commit can be checked out without problems.

Other than running `git fsck --full --strict`: not that I'm aware of.
And `--full` isn't even needed because it's the default.

Patrick
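
For illustration only, the kind of manual `packed-refs` inspection
discussed above could be approximated with a small script. This is a
rough sketch and not something git-fsck(1) does: the function name
`check_packed_refs` is hypothetical, and the script assumes the usual
text layout of the file (an optional `# pack-refs with: ...` header,
`<hex-oid> <refname>` entries, and `^<hex-oid>` peeled lines directly
after an entry). It only flags NUL bytes and malformed lines; it does
not verify that the referenced objects exist.

```python
import re

# Hypothetical helper, not part of Git. Assumed line shapes:
#   "<hex-oid> <refname>"  for a regular entry
#   "^<hex-oid>"           for the peeled value of the preceding entry
# Object IDs are assumed to be 40 (SHA-1) or 64 (SHA-256) hex digits.
ENTRY = re.compile(rb"(?:[0-9a-f]{40}|[0-9a-f]{64}) \S+")
PEELED = re.compile(rb"\^(?:[0-9a-f]{40}|[0-9a-f]{64})")


def check_packed_refs(data: bytes) -> list[str]:
    """Return a list of human-readable problems found in packed-refs data."""
    problems = []
    if data and not data.endswith(b"\n"):
        problems.append("file does not end with a newline")

    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final newline

    prev_was_entry = False
    for lineno, line in enumerate(lines, start=1):
        if b"\0" in line:
            # Blocks of binary zeros, as in the corruption reported above.
            problems.append(f"line {lineno}: contains NUL bytes")
            prev_was_entry = False
        elif line.startswith(b"#"):
            prev_was_entry = False  # header or comment line
        elif ENTRY.fullmatch(line):
            prev_was_entry = True
        elif PEELED.fullmatch(line) and prev_was_entry:
            prev_was_entry = False  # a peeled line must follow an entry
        else:
            problems.append(f"line {lineno}: unexpected format")
            prev_was_entry = False
    return problems
```

To try it, read the file in binary mode (for example
`open(".git/packed-refs", "rb").read()`) and pass the bytes to
`check_packed_refs`; an empty list means no problems were spotted by
this limited check.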