On Fri, Jan 13, 2017 at 04:15:42AM -0500, John Szakmeister wrote: > > I did notice another interesting case when looking at this. Fsck ends up > > in fsck_loose(), which has the sha1 and path of the loose object. It > > passes the sha1 to fsck_sha1(), and ignores the path entirely! > > > > So if you have a duplicate copy of the object in a pack, we'd actually > > find and check the duplicate. This can happen, e.g., if you had a loose > > object and fetched a thin-pack which made a copy of the loose object to > > complete the pack). > > > > Probably fsck_loose() should be more picky about making sure we are > > reading the data from the loose version we found. > > Interesting find! Thanks for the information Peff! So I figured I would knock this out as a fun morning exercise. But sheesh, it turned out to be a slog, because most of the functions rely on map_sha1_file() to convert the sha1 to an object path at the lowest level. But I finally got something working, so here it is. I found another bug on the way, along with a few cleanups. And then I did the trailing garbage detection at the end, because by that point I knew right where it needed to go. :) [1/6]: t1450: refactor loose-object removal [2/6]: sha1_file: fix error message for alternate objects [3/6]: t1450: test fsck of packed objects [4/6]: sha1_file: add read_loose_object() function [5/6]: fsck: parse loose object paths directly [6/6]: fsck: detect trailing garbage in all object types builtin/fsck.c | 46 +++++++++++---- cache.h | 13 ++++ sha1_file.c | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++----- t/t1450-fsck.sh | 86 +++++++++++++++++++++++---- 4 files changed, 284 insertions(+), 41 deletions(-) -Peff