Re: [RFC/PATCH 2/3] add a library of code for producing structured output

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Julian Phillips <julian@xxxxxxxxxxxxxxxxx> writes:

> Add a library that allows commands to produce structured output in any
> of a range of formats using a single API.
>
> The API includes an OPT_OUTPUT and handle_output_arg so that the
> option handling for different commands will be as similar as possible.

I was hoping that the existing low-level -z routines (e.g. "diff-* -z")
follow similar enough patterns to have a corresponding output-z.c and be
handled inside output.c library.  But that is not a requirement, just
"would have been nicer if the original were written that way".

> diff --git a/output-json.c b/output-json.c
> new file mode 100644
> index 0000000..0eb66b2
> --- /dev/null
> +++ b/output-json.c
> @@ -0,0 +1,128 @@
> +#include "git-compat-util.h"
> +#include "output.h"
> +#include "strbuf.h"
> +
> +static char *json_quote(char *s)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	while (*s) {
> +		switch (*s) {
> +...
> +		default:
> +			/* All control characters must be encode, even if they
> +			 * don't have a specific escape character of their own */
> +			if (*s < 0x20)
> +				strbuf_addf(&buf, "\\u%04x", *s);

As you didn't say your "char" is either signed or unsigned upfront, this
will behave differently when you are fed a UTF-8 string.  If it is signed,
you will end up showing bytes in a single letter separately at wrong
codepoint, and if it is unsigned, you will give UTF-8 string unquoted,
which probably is what you meant to do.

What is your design intention regarding legacy encoding?  This code does
not yet declare "dear user, if you plan to use json/xml output, your
repository metadata (notably the pathnames) has to be in UTF-8", as the
caller _could_ transliterate legacy data before feeding it to output.c
layer.  An alternative would be for the output.c layer to know about the
encoding of incoming data and transliterate when the output format
requires a particular encoding.

> +static void json_obj_item_start(FILE *file, char *name, int first)
> +{
> +	char *quoted = json_quote(name);
> +	if (!first)
> +		fprintf(file, ",\n");
> +	fprintf(file, "\"%s\" : ", quoted);
> +	free(quoted);
> +}
> + ...
> +static void json_str(FILE *file, char *value)
> +{
> +	char *quoted = json_quote(value);
> +	fprintf(file, "\"%s\"", quoted);
> +	free(quoted);
> +}

An obvious improvement would be to make json_quote() to take FILE * to
avoid wasteful allocation and copy, as it doesn't do anything but addstr
and addch, and all of its callers don't do anything but spitting the
result out to FILE *.

> diff --git a/output-xml.c b/output-xml.c
> new file mode 100644
> index 0000000..50dd7d6
> --- /dev/null
> +++ b/output-xml.c
> @@ -0,0 +1,68 @@
> +#include "git-compat-util.h"
> +#include "output.h"

This seems to totally lack quoting of any metacharacters for "name" and
string "value".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]