Re: [PATCH v7 3/4] notes.c: introduce '--separator=<paragraph-break>' option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Teng Long <dyroneteng@xxxxxxxxx> writes:

> From: Teng Long <dyroneteng@xxxxxxxxx>
>
> When adding new notes or appending to an existing notes, we will
> insert a blank line between the paragraphs, like:
>
>      $ git notes add -m foo -m bar
>      $ git notes show HEAD | cat
>      foo
>
>      bar
>
> The default behavour sometimes is not enough, the user may want
> to use a custom delimiter between paragraphs, like when
> specifiy '-m', '-F', '-C', '-c' options. So this commit

"like when specifying", you mean?

> introduce a new '--separator' option for 'git notes add' and
> 'git notes append', for example when execute:

"when executing"?

>     $ git notes add -m foo -m bar --separator="-"
>     $ git notes show HEAD | cat
>     foo
>     -
>     bar
>
> A newline is added to the value given to --separator if it

"a newline is ...", because this is not starting a new sentence, but
continuing from the "for example, when ..." above.

> does not end with one already. So execute:
>
>       $ git notes add -m foo -m bar --separator="-"
> and
>       $ export LF="
>       "
>       $ git notes add -m foo -m bar --separator="-$LF"
>
> Running A and B produces the same result.

Here, it is totally unclear what A and B refers to.  "both
... and ... produce the same result" or something?

> @@ -85,8 +87,12 @@ corresponding <to-object>.  (The optional `<rest>` is ignored so that
>  the command can read the input given to the `post-rewrite` hook.)
>  
>  append::
> -	Append to the notes of an existing object (defaults to HEAD).
> -	Creates a new notes object if needed.
> +	Append new message(s) given by `-m` or `-F` options to an
> +	existing note, or add them as a new note if one does not
> +	exist, for the object (defaults to HEAD).  When appending to
> +	an existing note, a blank line is added before each new
> +	message as an inter-paragraph separator.  The separator can
> +	be customized with the `--separator` option.

Excellent.

> @@ -101,6 +102,9 @@ struct note_data {
>  	int use_editor;
>  	char *edit_path;
>  	struct strbuf buf;
> +	struct strbuf **messages;
> +	size_t msg_nr;
> +	size_t msg_alloc;
>  };

Hmph, OK.  I'd think twice before allowing an array of strbufs,
though.  strbuf is an excellent data structure to allow editing
string safely, and an array of strbufs may be very useful when these
strings need to be edited simultaneously, but such a use case is
very rare and I suspect this would not be it---rather, don't we take
one string at a time while processing each option, perhaps running
stripspace, and then after that aren't we done with the string until
when we finally have these N strings to loop over and concatenate
into a single string?  It would be sensible to use a strbuf to
produce the concatenation, but the strings to be concatenated do not
have to come from strbufs.  So my intuition, before reading the code
but after seeing the data structure, says that "const char **messages"
might be more appropriate here.  It sends a strong message about the
code structure and "phases" of processing, i.e. "we read each piece
and store it in the array once we are done preprocessing; then in
the next phase we concatenate them".

> @@ -209,71 +213,110 @@ static void write_note_data(struct note_data *d, struct object_id *oid)
>  	}
>  }
>  
> +static void insert_separator(struct strbuf *message, size_t pos)
> +{
> +	if (!separator)
> +		strbuf_insertstr(message, pos, "\n");
> +	else if (separator[strlen(separator) - 1] == '\n')
> +		strbuf_insertstr(message, pos, separator);
> +	else
> +		strbuf_insertf(message, pos, "%s%s", separator, "\n");
> +}

Do we need "insert" separator?  The caller in "concat" certainly
shouldn't---all it needs is "append".

> +/* Consume messages and append them into d->buf, then free them */
> +static void concat_messages(struct note_data *d)
> +{
> +	struct strbuf msg = STRBUF_INIT;
> +
> +	size_t i;
> +	for (i = 0; i < d->msg_nr ; i++) {
> +		if (d->buf.len)
> +			insert_separator(&d->buf, d->buf.len);
> +		strbuf_add(&msg, d->messages[i]->buf, d->messages[i]->len);
> +		strbuf_addbuf(&d->buf, &msg);
> +		strbuf_reset(&msg);
> +		strbuf_release(d->messages[i]);
> +		free(d->messages[i]);
> +	}
> +	strbuf_release(&msg);
> +	free(d->messages);
> +}

As I suspected earlier, with the way d->messages[i] are used, they
have no reason to be strbufs---they can and should be a simple
string "const char *".

>  static int parse_msg_arg(const struct option *opt, const char *arg, int unset)
>  {
>  	struct note_data *d = opt->value;
> +	struct strbuf *buf;
>  
>  	BUG_ON_OPT_NEG(unset);
>  
> -	if (d->buf.len)
> -		strbuf_addch(&d->buf, '\n');
> -	strbuf_addstr(&d->buf, arg);
> -	strbuf_stripspace(&d->buf, 0);
> +	buf = xmalloc(sizeof(*buf));
> +	strbuf_init(buf, strlen(arg));
> +	strbuf_addstr(buf, arg);
> +	strbuf_stripspace(buf, 0);
>  
> -	d->given = 1;
> +	ALLOC_GROW_BY(d->messages, d->msg_nr, 1, d->msg_alloc);
> +	d->messages[d->msg_nr - 1] = buf;
>  	return 0;
>  }

And this one can use an on-stack strbuf (initialized with STRBUF_INIT),
and strbuf_detach() the result into d->messages[].

>  static int parse_file_arg(const struct option *opt, const char *arg, int unset)
>  {
>  	struct note_data *d = opt->value;
> +	struct strbuf *buf;

Likewise.

>  
>  	BUG_ON_OPT_NEG(unset);
>  
> -	if (d->buf.len)
> -		strbuf_addch(&d->buf, '\n');
> +	buf = xmalloc(sizeof(*buf));
> +	strbuf_init(buf, 0);
> +
>  	if (!strcmp(arg, "-")) {
> -		if (strbuf_read(&d->buf, 0, 1024) < 0)
> +		if (strbuf_read(buf, 0, 1024) < 0)
>  			die_errno(_("cannot read '%s'"), arg);
> -	} else if (strbuf_read_file(&d->buf, arg, 1024) < 0)
> +	} else if (strbuf_read_file(buf, arg, 1024) < 0)
>  		die_errno(_("could not open or read '%s'"), arg);
> -	strbuf_stripspace(&d->buf, 0);
> +	strbuf_stripspace(buf, 0);
>  
> -	d->given = 1;
> +	// we will note stripspace the buf here, because the file maybe
> +	// is a binary and it maybe contains multiple continuous line breaks.

No // comments in our codebase.

> +	ALLOC_GROW_BY(d->messages, d->msg_nr, 1, d->msg_alloc);
> +	d->messages[d->msg_nr - 1] = buf;
>  	return 0;
>  }

>  static int parse_reuse_arg(const struct option *opt, const char *arg, int unset)
>  {
> ...
>  }

Ditto.

> @@ -567,7 +617,8 @@ static int append_edit(int argc, const char **argv, const char *prefix)
>  	const struct object_id *note;
>  	char *logmsg;
>  	const char * const *usage;
> -	struct note_data d = { .buf = STRBUF_INIT };
> +	struct note_data d = { .buf = STRBUF_INIT, .messages = NULL };

Why explicitly initialize .messages to NULL when we are leaving
other members to zero initialized implicitly by using designated
initializers?

> @@ -618,7 +675,7 @@ static int append_edit(int argc, const char **argv, const char *prefix)
>  		char *prev_buf = read_object_file(note, &type, &size);
>  
>  		if (d.buf.len && prev_buf && size)
> -			strbuf_insertstr(&d.buf, 0, "\n");
> +			insert_separator(&d.buf, 0);
>  		if (prev_buf && size)
>  			strbuf_insert(&d.buf, 0, prev_buf, size);
>  		free(prev_buf);

Having to insert two things in front of d.buf feels somewhat
wasteful.  I wonder if we can easily reorder the logic to first read
the previous one, and then append new stuff to it, perhaps?

It wouldn't be a huge deal.  If this weren't the only reason why we
need to invent insertstr that takes "where", I wouldn't even be
mentioning it.

> diff --git a/t/t3301-notes.sh b/t/t3301-notes.sh
> index 3288aaec..716192b5 100755
> --- a/t/t3301-notes.sh
> +++ b/t/t3301-notes.sh
> @@ -5,6 +5,7 @@
>  
>  test_description='Test commit notes'
>  
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh

This is a strange commit to add this.  If the test and the code
involved in the test were leak-sanitizer clean before this commit,
then you should have been able to insert this without any other
change, and we should do it that way in a separate commit.  If we
are fixing an existing leak that the sanitizer complains, then that
is a different matter.  If that is the case, it makes perfect sense
to have this here, but the proposed commit log message should
explain that it is what is happening.

Other than that, looks very nicely done.

Thanks.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux