RE: [BUG] 2.44.0 t7704.9 Fails on NonStop ia64

<rsbecker@xxxxxxxxxxxxx> · Mon, 26 Feb 2024 13:03:21 -0500

>From: Phillip Wood <phillip.wood123@xxxxxxxxx>
On Monday, February 26, 2024 11:00 AM, Phillip Wood wrote:
>To: rsbecker@xxxxxxxxxxxxx; 'Torsten Bögershausen' <tboegi@xxxxxx>
>Cc: git@xxxxxxxxxxxxxxx; Patrick Steinhardt <ps@xxxxxx>
>Subject: Re: [BUG] 2.44.0 t7704.9 Fails on NonStop ia64
>
>On 26/02/2024 15:32, Phillip Wood wrote:
>> Hi Randal
>>
>> [cc'ing Patrick for the reftable writer]
>>
>> On 25/02/2024 20:36, rsbecker@xxxxxxxxxxxxx wrote:
>>> On Sunday, February 25, 2024 2:20 PM, Torsten Bögershausen wrote:
>>>> On Sun, Feb 25, 2024 at 02:08:35PM -0500, rsbecker@xxxxxxxxxxxxx wrote:
>>>>> On Sunday, February 25, 2024 1:45 PM, I wrote:
>>>>>> To: git@xxxxxxxxxxxxxxx
>>>> But I think that this should be used:
>>>> write_in_full()
>>>
>>> My mailer autocorrected, yes, xwrite. write_in_full() would be safe,
>>> although a bit redundant since xwrite() does similar things and is
>>> used by write_in_full().
>>
>> Note that unlike write_in_full(), xwrite() does not guarantee to write
>> the whole buffer passed to it. In general unless a caller is writing a
>> single byte or writing less than PIPE_BUF bytes to a pipe it should
>> use write_in_full().
>>
>>> The question is which call is bad? The cruft stuff is relatively new
>>> and I don't know the code.
>
>I should have been clearer that I do not think any of these calls are the likely
>problem for the cruft pack code. I do think the reftable writers are worth looking at
>though for git in general.
>
>For the cruft pack problem you might want to look for suspect xwrite() calls where
>the caller does not handle a short write correctly for example under builtin/ we have
>
>builtin/index-pack.c:                   err = xwrite(1, input_buffer +
>input_offset, input_len);
>builtin/receive-pack.c:         xwrite(2, msg, sz);
>builtin/repack.c:       xwrite(cmd->in, oid_to_hex(oid),
>the_hash_algo->hexsz);
>builtin/repack.c:       xwrite(cmd->in, "\n", 1);
>builtin/unpack-objects.c:               int ret = xwrite(1, buffer +
>offset, len);
>
>Best Wishes
>
>Phillip
>
>>>>> reftable/writer.c:              int n = w->write(w->write_arg,
>>>>> zeroed,
>>>>> w->pending_padding);
>>>>> reftable/writer.c:      n = w->write(w->write_arg, data, len);
>>
>> Neither of these appear to check for short writes and
>> reftable_fd_write() is a thin wrapper around write(). Maybe
>> reftable_fd_write() should be using write_in_full()?
>>
>>>>> run-command.c:                  len = write(io->fd, io->u.out.buf,
>>
>> This call to write() looks correct as it is in the io pump loop.
>>
>>>>> t/helper/test-path-utils.c:                     if (write(1,
>>>>> buffer,
>>> count)
>>>>> < 0) >>> t/helper/test-windows-named-pipe.c:             write(1,
>>>>> buf, nbr);
>>>>> t/helper/test-windows-named-pipe.c:             write(1, buf, nbr);
>>
>> In principle these all look like they are prone to short writes.
>>
>>>>> trace2/tr2_dst.c:       bytes = write(fd, buf_line->buf,
>>>>> buf_line->len);
>>
>> This caller explicitly says it prefers short writes over retrying

I'm getting caught a bit behind the curve. After rebuilding from master, I'm now getting:

+ test 1708960150 -lt 1708970156
+ test_line_count = 3 cruft.before
+ test_line_count = 2 cruft.after
test_line_count: line count for cruft.after != 2

This is looking more like a different problem than xwrite().