In-tree strbuf "in-place" search/replace (was: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:

> @@ -634,17 +634,46 @@ static void path_msg(struct merge_options *opt,
>  		     const char *fmt, ...)
>  {
>  	va_list ap;
> -	struct strbuf *sb = strmap_get(&opt->priv->output, path);
> +	struct strbuf *sb, *dest;
> +	struct strbuf tmp = STRBUF_INIT;
> +
> +	if (opt->record_conflict_msgs_as_headers && omittable_hint)
> +		return; /* Do not record mere hints in tree */
> +	sb = strmap_get(&opt->priv->output, path);
>  	if (!sb) {
>  		sb = xmalloc(sizeof(*sb));
>  		strbuf_init(sb, 0);
>  		strmap_put(&opt->priv->output, path, sb);
>  	}
>  
> +	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
> +
>  	va_start(ap, fmt);
> -	strbuf_vaddf(sb, fmt, ap);
> +	strbuf_vaddf(dest, fmt, ap);
>  	va_end(ap);
>  
> +	if (opt->record_conflict_msgs_as_headers) {
> +		int i_sb = 0, i_tmp = 0;
> +
> +		/* Copy tmp to sb, adding spaces after newlines */
> +		strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
> +		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
> +			/* Copy next character from tmp to sb */
> +			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
> +
> +			/* If we copied a newline, add a space */
> +			if (tmp.buf[i_tmp] == '\n')
> +				sb->buf[++i_sb] = ' ';
> +		}
> +		/* Update length and ensure it's NUL-terminated */
> +		sb->len += i_sb;
> +		sb->buf[sb->len] = '\0';
> +
> +		/* Clean up tmp */
> +		strbuf_release(&tmp);
> +	}
> +
> +	/* Add final newline character to sb */
>  	strbuf_addch(sb, '\n');
>  }
>  

I'm not saying this is wrong or needs to change. Just a reader's note
that this sent me on an interesting journey of looking at various
in-tree callers of strbufs that want to do the equivalent of
s/$from/$to/ on a strbuf, with and without the equivalent of /g.

I figured I'd change the $subject since this is more of a general
musing...

In trailer.c we've got strbuf_replace(), which looks like it could be
made to be general enough to serve most callers if it did a memmem()
instead of a strstr(), and knew to take a "all" flag to implement a /g.

We then have e.g. lf_to_crlf() in imap-send.c, which uses a newly
alloc'd buffer followed by a strbuf_attach(), which is a common pattern.

Then strbuf_reencode() in strbuf.c basically solves this problem, and
calls reencode_string_len(), both it and the underlying function are
*almost* general enough to know to take some "from/to" string/length
pair, i.e. to not be bound to "reencoding" only with iconv().

Then there's strbuf_add_percentencode() and strbuf_add_urlencode() whose
API users might be happy with in-place replacing, but do a
read-and-copy-maybe-expand.

It might be an interesting follow-up project for someone to come up with
a generic in-place search-replace function with a signature like:

	int strbuf_replace(struct strbuf *sb, const char *from,
			   size_t from_len, const char *to,
			   size_t to_len, int max);

To do e.g. in this case:

	if (opt->record_conflict_msgs_as_headers)
		strbuf_replace(sb, "\n", strlen("\n"), "\n ", strlen("\n "), -1);

The various in-tree implementations do some variant of over-mallocing to
save work in the loop (as is being done here), copying where a small
realloc/memmove might do, scanning the string to figure out how much to
malloc, then copying in a second pass etc.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux