Re: [PATCH 1/7] config: avoid "write_in_full(fd, buf, len) < len" pattern

Jonathan Nieder <jrnieder@xxxxxxxxx> · Wed, 13 Sep 2017 10:59:30 -0700

Jeff King wrote:
> On Wed, Sep 13, 2017 at 10:47:28AM -0700, Jonathan Nieder wrote:
>> Jeff King wrote:

>>> I scoured the code base for cases of this, but it turns out
>>> that these two in git_config_set_multivar_in_file_gently()
>>> are the only ones. This case is actually quite interesting:
>>> we don't have a size_t, but rather use the subtraction of
>>> two pointers. Which you might think would be a signed
>>> ptrdiff_t, but clearly both gcc and clang treat it as
>>> unsigned (possibly because the conditional just above
>>> guarantees that the result is greater than zero).
>>
>> Do you have more detail about this?  I get worried when I read
>> something like this that sounds like a compiler bug.
>>
>> C99 sayeth:
>>
>> 	When two pointers are subtracted, both shall point to elements
>> 	of the same array object, or one past the last element of the
>> 	array object; the result is the difference of the subscripts
>> 	of the two array elements. The size of the result is
>> 	implementation-defined, and its type (a signed integer type)
>> 	is ptrdiff_t defined in the <stddef.h> header.
>
> I'm not sure if it's a compiler bug or not. I read the bits about
> ptrdiff_t, and it wasn't entirely clear to me if a pointer difference
> _is_ an actual ptrdiff_t, or if it can generally be stored in one. Right
> below that text it also says:
>
>   If the result is not representable in an object of that type, the
>   behavior is undefined.

I can confidentally say the intent in C99 in that passage is to
describe the type of the expression, not just the type of a variable
that can hold it.

> That said, I might be wrong that unsigned promotion is the culprit. I
> didn't look at the generated assembly. But I also can't see what else
> would be causing the problem here. We're clearly returning "-1" and the
> condition doesn't trigger.
>
>> How can I reproduce the problem?
>
> I gave a recipe in the commit message, which is the best I came up with.
> You could probably use a fault-injection library to convince write() to
> fail. Or just tweak the source code to have write_in_full() return -1.

I wonder if a new test helper in t/helper/ would be able to do it (since
then it could e.g. control the filename that write_in_full writes to).

>>> There's no addition to the test suite here, since you need
>>> to convince write() to fail in order to see the problem. The
>>> simplest reproduction recipe I came up with is to trigger
>>> ENOSPC (this only works on Linux, obviously):
>>
>> Does /dev/full make it simpler to reproduce?
>
> I don't think so, because the write() failure is to the lockfile, which
> is created with O_EXCL. So even if you could convince "config.lock" to
> be the right device type, the open() would fail.

Hm, you're convincing me that it would indeed be worth hooking into a
fault injection framework (that e.g. uses LD_PRELOAD), but that's a
topic for another day.

Thanks,
Jonathan