Re: [RFC/PATCH 08/12] mktag: use fsck instead of custom verify_tag()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 26 2020, Jeff King wrote:

> On Thu, Nov 26, 2020 at 02:28:50AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> There was other "mktag" validation logic that I think makes sense to
>> just remove. Namely:
>> 
>>  A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag
>>     code disallowed values larger than 1400.
>> 
>>     Yes there's currently no timezone with a greater offset[2], but
>>     since we allow any number of non-offical timezones (e.g. +1234)
>>     passing this through seems fine. Git also won't break in the
>>     future if e.g. French Polynesia decides it needs to outdo the Line
>>     Islands when it comes to timezone extravagance.
>
> Yeah, I think this is a good choice to loosen.
>
>>  B. fsck allows missing author names such as "tagger <email>", mktag
>>     wouldn't, but would allow e.g. "tagger <email>" (but not "tagger
>>     <email>"). Now we allow all of these.
>
> Likewise, though I am confused. Should the second "tagger <email>" in
> that paragraph have something else in it?

There should be 1-2 and 3 spaces there, don't know why that didn't make
it. Will fix.

>>  C. Like B, but "mktag" disallowed spaces in the <email> part, fsck
>>     allows it.
>
> Possibly something we'd want to tighten in fsck, but I think keeping
> them in alignment is a good idea for now.
>
>> We didn't only lose obscure validation logic, we also gained some:
>> 
>>  D. fsck disallows zero-padded dates, but mktag didn't care. So
>>     e.g. the timestamp "0000000000 +0000" produces an error now. A
>>     test in "t1006-cat-file.sh" relied on this, it's been changed to
>>     use "hash-object" (without fsck) instead.
>
> Seems reasonable.
>
>> +	/* verify_tag() will be removed in the next commit */
>> +	verify_tag("", 0);
>> +
>> +	/*
>> +	 * Fake up an object for fsck_object()
>> +	 */
>> +	obj.parsed = 1;
>> +	obj.type = OBJ_TAG;
>
> I don't love this "fake object struct on the stack" thing. I can't think
> of anything that would break outright, but it may be the only place
> where that struct isn't coming from the usual pool, and representing the
> common part of a larger object. Two definite gotchas, though:
>
>   - if the type is OBJ_TAG, then it may get cast to a "struct tag" by
>     other code, which could look past the end of the struct. I think
>     that fsck_object() doesn't do this, but it could (and it definitely
>     used to)
>
>   - you don't initialize the other fields. We'll definitely pass
>     &obj->oid in the fsck code, and even back to our report() callback,
>     even though it's full of garbage. In practice this is OK because our
>     custom report() won't look at them, but it seems awfully fragile.
>
> I recently genericized the type-specific fsck_* functions so that they
> just need an oid, and not an object struct. I think I didn't do
> fsck_object() just because it didn't have any callers where it mattered.
> So I think it would make sense here to either:
>
>   - make fsck_tag() specifically available outside of fsck.c, so could
>     call it directly
>
>   - convert fsck_object() to take an oid rather than an object struct.
>     It only uses the object itself to check for NULL-ness. Looking at
>     the callers, they all have non-NULL objects already. So I think that
>     check can't be triggered.
>
>>  check_verify_failure 'Tag object length check' \
>> -	'^error: .*size wrong.*$'
>> +	'^error: missingObject:'

When I started out hacking on this I just made fsck_tag() a non-static
function. The faking up obj only got added because I needed to call
fsck_object().

Between this and your comment to 12/12 I wonder if the least bad thing
isn't just to split up fsck_tag() into fsck_tag() and
fsck_tag_standalone(), so I can call the latter with a "populate the
object and type" argument, and fsck.c will just NULL those two
parameters since it doesn't care.

Gets rid of the obj faking, and gets rid of the custom parser in
mktag.c. It'll only need to do the object lookup at that point, and even
then since I've added everything else to fsck.c I might as well make
that another custom ERROR/EXTRA, i.e. have it learn to do the lookup
itself.

> We may want to enhance the "error:" here to make it clear we're talking
> about a format error in the tag input. Maybe:
>
>   error: tag input does not pass fsck: missingObject: ...
>
> or something.

*nod*




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux