Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 22 2022, Taylor Blau wrote:

[I think John Cai would appreciate being CC'd on this]

> It's tempting to think that we could use `strbuf_getwholeline()` and
> specify either `\n` or `\0` as the terminating character. But for input
> on platforms that include a CR character preceeding the LF, this
> wouldn't quite be the same, since `strbuf_getline(...)` will trim any
> trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.

I commend the effort to maintain bug-for-bug compatibility, but do we
care about this distinction, or is it just an accident that we support
\r\n here in the first place?

This doesn't apply to the rest of cat-file directly, but the origin of
the recent --batch-command mode cdoe was to lift the same-ish code from
builtin/update-ref.c, whose \n or \0 mode does exactly that:

	while (!strbuf_getwholeline(&input, stdin, line_termination)) {

I.e. it doesn't support \r\n, just \n or \0.

Isn't that fine? I may be missing something, but why isn't it OK even on
platforms that use \r\n for their normal *.txt endings to only use \n or
\0 for this bit of protocol?

For the command mode at least this passes our tests:
	
	diff --git a/builtin/cat-file.c b/builtin/cat-file.c
	index f42782e955f..8646059472d 100644
	--- a/builtin/cat-file.c
	+++ b/builtin/cat-file.c
	@@ -614,12 +614,16 @@ static void batch_objects_command(struct batch_options *opt,
	 	struct queued_cmd *queued_cmd = NULL;
	 	size_t alloc = 0, nr = 0;
	 
	-	while (!strbuf_getline(&input, stdin)) {
	+	while (!strbuf_getwholeline(&input, stdin, '\n')) {
	 		int i;
	 		const struct parse_cmd *cmd = NULL;
	 		const char *p = NULL, *cmd_end;
	 		struct queued_cmd call = {0};
	 
	+		if (input.len > 0 && input.buf[input.len - 1] == '\n')
	+			--input.len;
	+		input.buf[input.len] = '\0';
	+
	 		if (!input.len)
	 			die(_("empty command in input"));
	 		if (isspace(*input.buf))

So maybe we should just do something like that instead? I.e. declare
that a mistake.

As for the rest of cat-file 05d5667fec9 (git-cat-file: Add --batch-check
option, 2008-04-23) documents that it's LF, not CR LF, ditto
git-cat-file.txt.

So isn't this just an accident in of us having used the strbuf_getline()
function to mean "\n", but actually it also does "\r\n".

Which is a really unfortunately named function b.t.w., since it sneaks
this bit of Windows portability into places that may not want it in the
first place.

>  strlen () {
>      echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
>  }
> @@ -398,6 +403,12 @@ test_expect_success '--batch with multiple sha1s gives correct format' '
>  	test "$(maybe_remove_timestamp "$batch_output" 1)" = "$(maybe_remove_timestamp "$(echo_without_newline "$batch_input" | git cat-file --batch)" 1)"
>  '
>  
> +test_expect_success '--batch, -z with multiple sha1s gives correct format' '
> +	echo_without_newline_nul "$batch_input" >in &&
> +	test "$(maybe_remove_timestamp "$batch_output" 1)" = \
> +	"$(maybe_remove_timestamp "$(git cat-file --batch -z <in)" 1)"

This...

> +'
> +
>  batch_check_input="$hello_sha1
>  $tree_sha1
>  $commit_sha1
> @@ -418,6 +429,24 @@ test_expect_success "--batch-check with multiple sha1s gives correct format" '
>      "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)"
>  '
>  
> +test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
> +    echo_without_newline_nul "$batch_check_input" >in &&
> +    test "$batch_check_output" = "$(git cat-file --batch-check -z <in)"

This....

> +'
> +
> +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
> +	touch -- "newline${LF}embedded" &&
> +	git add -- "newline${LF}embedded" &&
> +	git commit -m "file with newline embedded" &&
> +	test_tick &&
> +
> +	printf "HEAD:newline${LF}embedded" >in &&
> +	git cat-file --batch-check -z <in >actual &&
> +
> +	echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect &&

..and this hides git's exit code, better to pipe to a file, use test_cmp
etc. etc.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux