Re: [PATCH v4 2/2] sequencer: fix quoting in write_author_script

Phillip Wood <phillip.wood@xxxxxxxxxxxx> · Thu, 9 Aug 2018 11:08:40 +0100

Hi Eric

On 08/08/18 09:43, Eric Sunshine wrote:
> On Tue, Aug 7, 2018 at 9:54 AM Phillip Wood <phillip.wood@xxxxxxxxxxxx> wrote:
>> On 07/08/18 11:23, Eric Sunshine wrote:
>>> On Tue, Aug 7, 2018 at 5:35 AM Phillip Wood <phillip.wood@xxxxxxxxxxxx> wrote:
>>>> +       if (n > 0 && s[n] != '\'')
>>>> +               return 1;
>>>
>>> To be "technically correct", I think the condition in the 'if'
>>> statement should be ">=". It should never happen in practice that the
>>> entire content of the file is a single character followed by zero or
>>> more newlines, but using the proper condition ">=" would save future
>>> readers of this code a "huh?" moment.
>>
>> I'm not sure it is that simple. If the script consists solely of a
>> single quote then we should return 1, if it is a single character that
>> is not "'" then it should return 0. Currently it returns 0 in both those
>> cases so is technically broken when the script is "'". If it used ">="
>> instead then I think it would return 0 when it should return 1 and vice
>> versa. As you say this shouldn't happen in practice.
> 
> It shouldn't happen in practice, but if a human (power user) edits
> this file, we shouldn't discount the possibility. However, I'm not so
> concerned about quoting_is_broken() returning a meaningful answer in
> this case since we have much bigger problems if we get such a file.
> (Indeed, what answer could quoting_is_broken() return which would be
> useful or meaningful for such a malformed file?)
> 
> What does concern me is that read_env_script() doesn't seem to care
> about such a malformed file; it doesn't do any validation at all.
> Contrast that with read_author_ident() which is pretty strict about
> the content it expects to find in the file. So, it might make sense to
> upgrade read_env_script() to do some sort of validation on each line
> (though that shouldn't be in this patch, and doesn't even need to be
> in this series). For instance, it could check that each line looks
> something like what would be matched by this regex: /[A-Z0-9_]+='.+'/

I think it should just share some code with get_author_ident() that uses
sq_dequote(), that's for another day though.

> (And, no, I'm not saying that regex should be used for validation; I'm
> just using it as an example.)
> 
> As for '>' vs. '>=', it caused more than a slight hiccup when I was
> scanning the code, and I worry that future readers could be similarly
> impacted.

I don't have a strong opinion either way, if Junio wants to fix it up
I'd be happy with that, but I'm not keen on another iteration just for that.

Best Wishes

Phillip