Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock

Stefan Beller <sbeller@xxxxxxxxxx> · Thu, 17 Sep 2015 10:26:19 -0700

On Thu, Sep 17, 2015 at 10:13 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, Sep 17, 2015 at 09:58:00AM -0700, Junio C Hamano wrote:
>
>> > Certainly anybody who does not realize their descriptor is O_NONBLOCK
>> > and is using the spinning for correctness. I tend to think that such
>> > sites are wrong, though, and would benefit from us realizing they are
>> > spinning.
>>
>> With or without O_NONBLOCK, not looping around xread() _and_ relying
>> on the spinning for "correctness" means the caller is not getting
>> correctness anyway, I think, because xread() does return a short
>> read, and we deliberately and explicitly do so since 0b6806b9
>> (xread, xwrite: limit size of IO to 8MB, 2013-08-20).
>
> I think they have to loop for correctness, but they may do this:
>
>   if (xread(fd, buf, len) < 0)
>         die_errno("OMG, an error!");
>
> which is not correct if "fd" is unknowingly non-blocking. As Stefan
> mentioned, we do not set O_NONBLOCK ourselves very much, but I wonder if
> we could inherit it from the environment in some cases.
>
> The spinning behavior is not great, but does mean that we spin and
> continue rather than bailing with an error.
>
>> > But I think you can't quite get away with leaving strbuf_read untouched
>> > in this case. On error, it wants to restore the original value of the
>> > strbuf before the strbuf_read call. Which means that we throw away
>> > anything read into the strbuf before we get EAGAIN, and the caller never
>> > gets to see it.
>>
>> I agree we need to teach strbuf_read() that xread() is now nicer on
>> O_NONBLOCK; perhaps like this?
>>
>>  strbuf.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/strbuf.c b/strbuf.c
>> index cce5eed..49104d7 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>>
>>               cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>>               if (cnt < 0) {
>> +                     if (errno == EAGAIN || errno == EWOULDBLOCK)
>> +                             break;
>>                       if (oldalloc == 0)
>>                               strbuf_release(sb);
>>                       else
>
> If we get EAGAIN on the first read, this will return "0", and I think we
> end up in the "was it EOF, or EAGAIN?" situation I mentioned earlier.
> If we reset errno to "0" at the top of the function, we could get around
> one problem, but it still makes an annoying interface: the caller has to
> check errno for any 0-return to figure out if it was really EOF, or just
> EAGAIN.
>
> If we return -1, though, we have a similar annoyance. If the caller
> notices a -1 return value and finds EAGAIN, they still may need to check
> sb->len to see if they made forward progress and have data they should
> be dealing with.

If errno == EAGAIN, we know it is a non blocking fd, so we could encode
the length read as (- 2 - length), such that

...-2 the length read if EAGAIN was ignored
-1 for error, check errno!
0 for EOF
+1... length if we just read it or restarted it due to EINTR.

The call site should know if it is non blocking (i.e. if the <-2 case can
happen) and handle it appropriately.

Any callsite of today which is unaware of the fd being non blocking would break
by this (as we want to fix the spinning anyway eventually)

>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html