Re: Suggestion: "verify/repair" option for 'git gc'

On Wed, Oct 13 2021, Alexandr Miloslavskiy wrote:

> Suggestion
> ----------
> 1) It would be nice if 'git gc' had an option to also verify
>    (like 'git fsck') the repo and report corruption. I think that it's
>    a good idea to have it in 'gc' for performance reasons, because
>    'git gc' already reads things.
>
> 2) It would be nice if git could automatically download blobs from
>    the remote if a local blob is corrupted. Maybe this is already
>    implemented; see story 3 below.
>
> Motivation
> ----------
>
> -- Story 1 --
> Just a few days ago I encountered another silently broken repo which
> caused some small bugs in the git UI I'm using. The repo mostly worked
> fine, which is why I had no idea that it was corrupted.
>
> My git UI invokes 'git gc' sometimes, and if that had detected the
> corruption, I wouldn't have had to spend time hunting for the bug in
> the UI.
>
> Specifically, it reports these errors on `git fsck`:
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 1d571d7354f99b726bbcc0cb232b3f47846c71a1: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 2808b286c2a933e88735d97416e29b9514fc6af2: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 604f6f6c4fbf8da7a593708e863e68f8c5a27d07: broken links
>   error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob
>   error in tree 6a2c4a5ef0b0ee7aa85d88c3147b7558a6a7c29f: broken links
>
> The repo is not confidential and I could share it if needed.
> I "solved" the problem by cloning a new copy.

I'd be interested in a copy of it. I've been slowly trying to improve
how we handle these sorts of corruption cases.
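
In the meantime, a UI that already shells out to 'git gc' could run
'git fsck' first and skip the gc when fsck reports errors. A rough
sketch in Python, not a proposal for how gc itself should do it: the
function names are made up for illustration, and the regexes assume the
message wording quoted above, which may differ between git versions.

```python
import re
import subprocess

# Assumed wording, modeled on the fsck output quoted above:
#   error: object <40-hex id> is a commit, not a blob
WRONG_TYPE = re.compile(r"error: object ([0-9a-f]{40}) is a (\w+), not a (\w+)")
#   error in tree <40-hex id>: broken links
BROKEN_LINKS = re.compile(r"error in (\w+) ([0-9a-f]{40}): broken links")


def summarize_fsck(output):
    """Group fsck error lines by kind; returns a dict of sets of object ids."""
    summary = {"wrong_type": set(), "broken_links": set()}
    for line in output.splitlines():
        match = WRONG_TYPE.search(line)
        if match:
            summary["wrong_type"].add(match.group(1))
            continue
        match = BROKEN_LINKS.search(line)
        if match:
            summary["broken_links"].add(match.group(2))
    return summary


def fsck_then_gc(repo):
    """Run 'git fsck'; run 'git gc' only if fsck exited with status 0.

    Returns None when gc ran, otherwise the summary of fsck errors.
    """
    fsck = subprocess.run(
        ["git", "-C", repo, "fsck", "--no-dangling"],
        capture_output=True,
        text=True,
    )
    if fsck.returncode != 0:
        return summarize_fsck(fsck.stdout + fsck.stderr)
    subprocess.run(["git", "-C", repo, "gc"], check=True)
    return None
```

This pays for a full fsck on every gc, so it does not get the "gc
already reads things" saving you are asking for; it only makes the
corruption visible earlier.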

> -- Story 2 --
> A few years ago, I had another repo that wasn't used for a couple years
> and had corrupted blobs. The repo looked fine until I tried to clone
> from it. Unfortunately it was the only copy and I had to write some
> code to "guess" the blob's contents to repair the repo.
>
> If 'git gc' detected corruption, I would have known about the problem
> earlier, when I still had other copies around.

I wonder whether this and the other issues you encountered would even
need a full "fsck", or whether having gc trigger a complete repack would
be enough. Which is not to say that some regular background "fsck"
wouldn't be a good idea...
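
To make "complete repack" concrete, here is a minimal sketch of what a
caller could run today. The helper names are hypothetical. The flags are
existing 'git repack' options, but whether rewriting every object this
way would have caught the specific corruption in your stories is an
assumption on my part, not something I have checked.

```python
import subprocess


def full_repack_argv(repo):
    """Build a 'git repack' command that rewrites everything into one pack."""
    return [
        "git", "-C", repo, "repack",
        "-a",  # pack everything referenced into a single pack
        "-d",  # remove packs made redundant by the new one
        "-F",  # pass --no-reuse-object: recompress all object data
    ]


def full_repack(repo):
    """Run the repack; returns (succeeded, stderr text)."""
    result = subprocess.run(
        full_repack_argv(repo), capture_output=True, text=True
    )
    return result.returncode == 0, result.stderr
```

Note that "-a" together with "-d" drops unreachable objects that lived
in the old packs; "-A" instead of "-a" would keep them as loose objects.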

> -- Story 3 --
> Also a few years ago, I had a repo with a single corrupted blob. I
> don't remember why, but simply re-cloning it was a headache. I managed
> to fix the repo by issuing a command to re-download the blob from the
> remote. Git could totally do that itself, I think.

Yes, we still definitely have cases where dealing with this sort of
thing can be very painful.
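
Whether the replacement content comes from a guess (Story 2) or from
another copy (Story 3), it can at least be checked against the object id
before being written back. A small sketch, assuming a SHA-1 repository
(not the SHA-256 object format); the function names are made up for
illustration:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object id git assigns to a blob with this content."""
    # A blob is hashed as the header "blob <size>\0" followed by the bytes.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def matches_blob(expected_id: str, content: bytes) -> bool:
    """True if 'content' is exactly the blob named by 'expected_id'."""
    return git_blob_id(content) == expected_id.lower()
```

Content that matches could then be written into the object store with
'git hash-object -w <file>', after moving the corrupt object out of the
way.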


