Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/24/2015 04:58 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@xxxxxxxxxxxx> writes:
> 
>> It is easy to allow "--abbrev=+7"; I would just need to add NUM_PLUS to
>> those call sites. Should I do so?
> 
> The more relevant question to ask from my point of view is why you
> need to "add" NUM_PLUS to "enable" it.  What valid reason do you
> have to forbid it anywhere?  Only because you do not accept it by
> default, you need to "add" to "enable".

I want to be able to plunge into this project without first auditing all
call sites to see which features will turn out to be needed. So I'm
erring on the side of flexibility. For now, I want to be able to
prohibit '+' signs.

Currently all of the flags cause additional features to be enabled. My
guess was that most callers *won't* need most features, so it seemed
easiest and most consistent to have all features be turned off by
default and let the caller add the features that it wants to allow.

Regarding specifically allowing/disallowing a leading '+': I saw a
couple of callsites that explicitly check that the first character is a
digit before calling strtol(). I assumed that is to disallow sign
characters [1]. For example,

    diff.c: optarg()
    builtin/apply.c: parse_num()
    maybe date.c: match_multi_number()

There are other callsites that call strtoul(), but store the result in a
signed variable. Those would presumably not want to allow leading '-',
but I'm not sure.

I also imagine that there are places in protocols or file formats where
signs should not be allowed (e.g., timestamps in commits?).

>>> Why is it a problem to allow "git cmd --hexval=0x1234", even if "git
>>> cmd --hexval=1234" would suffice?
>>
>> In some cases we would like to allow that flexibility; in some cases
>> not. But the strtol()/strtoul() functions *always* allow it.
> 
> The same issue.  Whare are these "some cases"?

I admit I'm not sure there are such places for hexadecimal numbers.

I'm coming around to an alternate plan:

Step 1: write a NUM_DEFAULT combination-of-options that gives the new
functions behavior very like strtol()/strtoul() but without their insane
features.

Step 2: rewrite all callers to use that option (and usually endptr=NULL,
meaning no trailing characters) unless it is manifestly clear that they
are already trying to forbid some other features. This will already
produce the largest benefit: avoiding overflows, missing error checking,
etc.

Steps 3 through ∞: as time allows, rewrite individual callsites to be
stricter where appropriate.

Hopefully steps 1 and 2 will not be too controversial.

Michael

[1] That assumption is based on a rather quick look over the code,
because with well over 100 callsites, it is not practical to study each
callsite carefully.

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]