On 09.05.20 at 08:19, Brandon Williams wrote:
> Here's the setup:
> tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
> tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
> blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
>
> $ git ls-tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
> 100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689	hello
> 100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689	hello.c
> 040000 tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6	hello

> Am I correct in assuming that this object is indeed invalid and should be
> rejected by fsck?

I'd say yes twice -- what good is a tree that you can't check out because
it contains a d/f conflict?

So I got curious whether such trees might exist in popular repos, wrote the
patch below and checked around a bit, but couldn't find any.

Is there a smarter way to check for duplicates?  One that doesn't need
allocations?  Perhaps by having a version of tree_entry_extract() that
seeks backwards somehow?

---
 fsck.c          | 10 ++++++++++
 t/t1450-fsck.sh | 16 ++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/fsck.c b/fsck.c
index 087a7f1ffc..f47b35fee8 100644
--- a/fsck.c
+++ b/fsck.c
@@ -587,6 +587,8 @@ static int fsck_tree(const struct object_id *oid,
 	struct tree_desc desc;
 	unsigned o_mode;
 	const char *o_name;
+	struct string_list names = STRING_LIST_INIT_NODUP;
+	size_t nr;
 
 	if (init_tree_desc_gently(&desc, buffer, size)) {
 		retval += report(options, oid, OBJ_TREE, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree");
@@ -680,8 +682,16 @@ static int fsck_tree(const struct object_id *oid,
 
 		o_mode = mode;
 		o_name = name;
+		string_list_append(&names, name);
 	}
 
+	nr = names.nr;
+	string_list_sort(&names);
+	string_list_remove_duplicates(&names, 0);
+	if (names.nr != nr)
+		has_dup_entries = 1;
+	string_list_clear(&names, 0);
+
 	if (has_null_sha1)
 		retval += report(options, oid, OBJ_TREE, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
 	if (has_full_path)
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 449ebc5657..91a6e34f38 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -257,6 +257,22 @@ test_expect_success 'tree object with duplicate entries' '
 	test_i18ngrep "error in tree .*contains duplicate file entries" out
 '
 
+test_expect_success 'tree object with duplicate names' '
+	test_when_finished "remove_object \$blob" &&
+	test_when_finished "remove_object \$tree" &&
+	test_when_finished "remove_object \$badtree" &&
+	blob=$(echo blob | git hash-object -w --stdin) &&
+	printf "100644 blob %s\t%s\n" $blob x.2 >tree &&
+	tree=$(git mktree <tree) &&
+	printf "100644 blob %s\t%s\n" $blob x.1 >badtree &&
+	printf "100644 blob %s\t%s\n" $blob x >>badtree &&
+	printf "040000 tree %s\t%s\n" $tree x >>badtree &&
+	badtree=$(git mktree <badtree) &&
+	test_must_fail git fsck 2>out &&
+	test_i18ngrep "$badtree" out &&
+	test_i18ngrep "error in tree .*contains duplicate file entries" out
+'
+
 test_expect_success 'unparseable tree object' '
 	test_oid_cache <<-\EOF &&
 	junk sha1:twenty-bytes-of-junk
-- 
2.26.2
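
For illustration only, here is a standalone sketch of the idea behind the
fsck.c hunk, outside of Git.  Tree entries are ordered as if directory
names ended in "/", so "hello" (blob), "hello.c" and "hello" (tree) are
not neighbours, and a check that only compares adjacent entries can miss
the duplicate.  Sorting a copy of the plain names makes equal names
adjacent.  The helper names has_adjacent_duplicate() and
has_duplicate_names() are made up for this sketch and are not Git
functions; it uses plain qsort() where the patch uses string_list, and it
still allocates, so it does not answer the question about avoiding
allocations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: plain byte-wise name order, ignoring entry type. */
static int cmp_names(const void *a, const void *b)
{
	const char *const *x = a;
	const char *const *y = b;

	return strcmp(*x, *y);
}

/*
 * Hypothetical helper: compare each name only with its predecessor,
 * in the order given (i.e. tree order).  Returns 1 on a match, else 0.
 */
static int has_adjacent_duplicate(const char *const *names, size_t nr)
{
	size_t i;

	for (i = 1; i < nr; i++)
		if (!strcmp(names[i - 1], names[i]))
			return 1;
	return 0;
}

/*
 * Hypothetical helper: return 1 if any name occurs more than once in
 * names[0..nr), 0 if all names are unique, and -1 if the scratch array
 * cannot be allocated.  The caller's array is left untouched.
 */
static int has_duplicate_names(const char *const *names, size_t nr)
{
	const char **sorted;
	size_t i;
	int dup = 0;

	if (nr < 2)
		return 0;
	assert(names != NULL);

	sorted = malloc(nr * sizeof(*sorted));
	if (!sorted)
		return -1;
	memcpy(sorted, names, nr * sizeof(*sorted));

	/* After sorting by plain name, equal names sit next to each other. */
	qsort(sorted, nr, sizeof(*sorted), cmp_names);
	for (i = 1; i < nr; i++) {
		if (!strcmp(sorted[i - 1], sorted[i])) {
			dup = 1;
			break;
		}
	}

	free(sorted);
	return dup;
}
```

Fed the names from the ls-tree output quoted above, in that order,
has_adjacent_duplicate() reports nothing while has_duplicate_names()
flags the tree.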