Re: [PATCH 10/16] mktree: overwrite duplicate entries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick Steinhardt wrote:
> On Tue, Jun 11, 2024 at 06:24:42PM +0000, Victoria Dye via GitGitGadget wrote:
>> From: Victoria Dye <vdye@xxxxxxxxxx>
>>
>> If multiple tree entries with the same name are provided as input to
>> 'mktree', only write the last one to the tree. Entries are considered
>> duplicates if they have identical names (*not* considering mode); if a blob
>> and a tree with the same name are provided, only the last one will be
>> written to the tree. A tree with duplicate entries is invalid (per 'git
>> fsck'), so that condition should be avoided wherever possible.
>>
>> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx>
>> ---
>>  Documentation/git-mktree.txt |  8 ++++---
>>  builtin/mktree.c             | 45 ++++++++++++++++++++++++++++++++----
>>  t/t1010-mktree.sh            | 36 +++++++++++++++++++++++++++--
>>  3 files changed, 80 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/git-mktree.txt b/Documentation/git-mktree.txt
>> index fb07e40cef0..afbc846d077 100644
>> --- a/Documentation/git-mktree.txt
>> +++ b/Documentation/git-mktree.txt
>> @@ -43,9 +43,11 @@ OPTIONS
>>  INPUT FORMAT
>>  ------------
>>  Tree entries may be specified in any of the formats compatible with the
>> -`--index-info` option to linkgit:git-update-index[1]. The order of the tree
>> -entries is normalized by `mktree` so pre-sorting the input by path is not
>> -required.
>> +`--index-info` option to linkgit:git-update-index[1].
>> +
>> +The order of the tree entries is normalized by `mktree` so pre-sorting the input
>> +by path is not required. Multiple entries provided with the same path are
>> +deduplicated, with only the last one specified added to the tree.
> 
> Hm. I'm not sure whether this is a good idea. With git-mktree(1) being
> part of our plumbing layer, you can expect that it's mostly going to be
> fed input from scripts. And any script that generates duplicate tree
> entries is broken, but we now start to paper over such brokenness
> without giving the user any indicator of this. As user of git-mktree(1)
> in Gitaly I can certainly say that I'd rather want to see it die instead
> of silently fixing my inputs so that I start to notice my own bugs.

'git mktree' already does some cleaning of the inputs by sorting the
entries, presumably so that a valid tree is created rather than one with
ordering errors. Deduplication is also a cleanup of user inputs to ensure a
valid tree is created, so to me it's a consistent extension to existing
behavior. Conversely, rejecting the inputs and failing would be introducing
an error scenario where none existed previously, which to me would be a
bigger deviation.

One potential way to get the kind of functionality you're looking for,
though, might be to combine something like '--literally' and a '--strict'
that validates the tree before writing. Like I mentioned in the cover letter
[1], I do plan to submit a follow-up series with '--strict' (it's just that
this series is already pretty long and it would add 4-ish more patches). 

[1] https://lore.kernel.org/git/pull.1746.git.1718130288.gitgitgadget@xxxxxxxxx/

> So without seeing a strong motivating usecase for this feature I'd think
> that git-mktree(1) should reject such inputs and return an error such
> that the user can fix their tooling.

Practically, there are a couple of reasons that led me to wanting this
behavior. One is that it allows using data structures with more rigid
integrity checks (like the index & cache tree). The other is that, once the
ability to add nested entries is introduced, the concept of a "duplicate"
gets fuzzier and blocking them entirely could lead to inconsistencies and/or
limited flexibility. If, for example a user wants to create a tree with a
directory 'folder1/' with OID '0123456789012345678901234567890123456789',
but update a blob 'folder1/file1' in it to OID
'0987654321098765432109876543210987654321', the latter is technically a
"duplicate" but rejecting it would avoid being able to create the tree
without first expanding 'folder1/'with something like 'ls-tree', replacing the
appropriate entry, then calling 'mktree'.

> 
> Patrick






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux