Patrick Steinhardt wrote: > On Tue, Jun 11, 2024 at 06:24:42PM +0000, Victoria Dye via GitGitGadget wrote: >> From: Victoria Dye <vdye@xxxxxxxxxx> >> >> If multiple tree entries with the same name are provided as input to >> 'mktree', only write the last one to the tree. Entries are considered >> duplicates if they have identical names (*not* considering mode); if a blob >> and a tree with the same name are provided, only the last one will be >> written to the tree. A tree with duplicate entries is invalid (per 'git >> fsck'), so that condition should be avoided wherever possible. >> >> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> >> --- >> Documentation/git-mktree.txt | 8 ++++--- >> builtin/mktree.c | 45 ++++++++++++++++++++++++++++++++---- >> t/t1010-mktree.sh | 36 +++++++++++++++++++++++++++-- >> 3 files changed, 80 insertions(+), 9 deletions(-) >> >> diff --git a/Documentation/git-mktree.txt b/Documentation/git-mktree.txt >> index fb07e40cef0..afbc846d077 100644 >> --- a/Documentation/git-mktree.txt >> +++ b/Documentation/git-mktree.txt >> @@ -43,9 +43,11 @@ OPTIONS >> INPUT FORMAT >> ------------ >> Tree entries may be specified in any of the formats compatible with the >> -`--index-info` option to linkgit:git-update-index[1]. The order of the tree >> -entries is normalized by `mktree` so pre-sorting the input by path is not >> -required. >> +`--index-info` option to linkgit:git-update-index[1]. >> + >> +The order of the tree entries is normalized by `mktree` so pre-sorting the input >> +by path is not required. Multiple entries provided with the same path are >> +deduplicated, with only the last one specified added to the tree. > > Hm. I'm not sure whether this is a good idea. With git-mktree(1) being > part of our plumbing layer, you can expect that it's mostly going to be > fed input from scripts. And any script that generates duplicate tree > entries is broken, but we now start to paper over such brokenness > without giving the user any indicator of this. As user of git-mktree(1) > in Gitaly I can certainly say that I'd rather want to see it die instead > of silently fixing my inputs so that I start to notice my own bugs. 'git mktree' already does some cleaning of the inputs by sorting the entries, presumably so that a valid tree is created rather than one with ordering errors. Deduplication is also a cleanup of user inputs to ensure a valid tree is created, so to me it's a consistent extension to existing behavior. Conversely, rejecting the inputs and failing would be introducing an error scenario where none existed previously, which to me would be a bigger deviation. One potential way to get the kind of functionality you're looking for, though, might be to combine something like '--literally' and a '--strict' that validates the tree before writing. Like I mentioned in the cover letter [1], I do plan to submit a follow-up series with '--strict' (it's just that this series is already pretty long and it would add 4-ish more patches). [1] https://lore.kernel.org/git/pull.1746.git.1718130288.gitgitgadget@xxxxxxxxx/ > So without seeing a strong motivating usecase for this feature I'd think > that git-mktree(1) should reject such inputs and return an error such > that the user can fix their tooling. Practically, there are a couple of reasons that led me to wanting this behavior. One is that it allows using data structures with more rigid integrity checks (like the index & cache tree). The other is that, once the ability to add nested entries is introduced, the concept of a "duplicate" gets fuzzier and blocking them entirely could lead to inconsistencies and/or limited flexibility. If, for example a user wants to create a tree with a directory 'folder1/' with OID '0123456789012345678901234567890123456789', but update a blob 'folder1/file1' in it to OID '0987654321098765432109876543210987654321', the latter is technically a "duplicate" but rejecting it would avoid being able to create the tree without first expanding 'folder1/'with something like 'ls-tree', replacing the appropriate entry, then calling 'mktree'. > > Patrick