Re: [PATCH 8/8] fetch: introduce machine-parseable "porcelain" output format

Patrick Steinhardt <ps@xxxxxx> · Thu, 27 Apr 2023 12:58:18 +0200

On Wed, Apr 26, 2023 at 12:52:46PM -0700, Glen Choo wrote:
> Patrick Steinhardt <ps@xxxxxx> writes:
> 
> > The output format is quite simple:
> >
> > ```
> > <flag> <old-object-id> <new-object-id> <local-reference>\n
> > ```
> 
> This format doesn't show the remote name or url that was fetched. That
> seems okay when fetching with a single remote, but it seems necessary
> with "--all". Perhaps you were planning to add that in a later series?
> If so, I think it's okay to call the "porcelain" format experimental,
> and forbid porcelain + --all until then.

The reason is mostly that I didn't find an output format that I really
liked here. We'd basically have to repeat the remote URL for every
single reference: just repeating it once per remote doesn't fly because
with `--parallel` the output could be intermingled. But doing that feels
wasteful to me, so I bailed. I guess I'm also biased here because it
just wouldn't be useful to me.

So with that in mind, I'd like to continue ignoring this issue for now
and just not report the remote that the ref came from. But I'd also
argue that we don't have to restrict porcelain mode to single-remote
fetches: it can still be useful to do multi-remote fetches even without
the information where a certain reference update comes from. So any kind
of restriction would feel artificial to me here.
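
For what it's worth, a script that really does need the remote can get
there today by fetching one remote at a time and labelling the lines
itself, at the cost of losing `--parallel`. A rough sketch, not part of
this series: `label_with_remote` and `fetch_all_labelled` are made-up
names, `--output-format=porcelain` is the option as proposed in this
patch, and remote names are assumed not to contain whitespace.

```sh
#!/bin/sh
# Sketch only: fetch each remote separately so that every porcelain line
# can be labelled with the remote it came from.

# Prefix every line read from stdin with the given remote name and a space.
# "IFS= read -r" keeps leading whitespace and backslashes intact.
label_with_remote () {
	while IFS= read -r line
	do
		printf '%s %s\n' "$1" "$line"
	done
}

# Hypothetical wrapper around the option proposed in this series.
# Failures of individual fetches are not handled here.
fetch_all_labelled () {
	for remote in $(git remote)
	do
		git fetch --output-format=porcelain "$remote" |
		label_with_remote "$remote"
	done
}
```

Calling `fetch_all_labelled` would then print each update as
`<remote> <flag> <old-object-id> <new-object-id> <local-reference>`.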

Furthermore, I'd argue that it is not necessary to label the format as
experimental only because of this limitation. With the refactorings done
in this and the preceding patch series it is easy to add a new format in
case somebody does turn out to have a use case for this. The
"porcelain" format should stay stable, and if we decide that we want to
also report the remote for each reference in a follow-up we can easily
add a "porcelain-v2" or "porcelain-with-remote" format.

As I said though: I'm clearly biased, so if you feel like my train of
thought is simply me being lazy then I'd cave in and adapt.

> > We assume two conditions which are generally true:
> >
> >     - The old and new object IDs have fixed known widths and cannot
> >       contain spaces.
> >
> >     - References cannot contain newlines.
> 
> This seems like a non-issue if we add a -z CLI option to indicate that
> entries should be NUL terminated instead of newline terminated, but that
> can be done as a followup.

Yeah, either via `-z` or a new porcelain output format. But both of
these conditions should generally be true anyway, so I don't expect
them to become a problem.
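
To illustrate why the two conditions are enough, here is roughly how a
consumer could split a line. This is only a sketch: the helper name
`parse_porcelain_line` is made up, and it assumes that the flag is
exactly one character (which may itself be a space) followed by a single
space, which is not spelled out above. Because of that it peels the flag
off by position instead of splitting on whitespace.

```sh
#!/bin/sh
# Hypothetical helper: split one porcelain line into its four fields and
# print them separated by "|".
# Assumption: the flag is a single character, possibly a space, followed
# by exactly one separating space.
parse_porcelain_line () {
	line=$1
	# First character of the line, even if that character is a space.
	flag=${line%"${line#?}"}
	# Drop the flag and the separator that follows it.
	rest=${line#??}
	# Object IDs cannot contain spaces, so cut at the next space.
	old=${rest%% *}
	rest=${rest#* }
	new=${rest%% *}
	# Whatever is left is the local reference.
	ref=${rest#* }
	printf '%s|%s|%s|%s\n' "$flag" "$old" "$new" "$ref"
}
```

A script would feed it one line at a time, for example from a
`while IFS= read -r line` loop over the fetch output.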

> > With these assumptions, the output format becomes unambiguously
> > parseable. Furthermore, given that this output is designed to be
> > consumed by scripts, the machine-readable data is printed to stdout
> > instead of stderr like the human-readable output is. This is mostly done
> > so that other data printed to stderr, like error messages or progress
> > meters, don't interfere with the parseable data.
> 
> Sending the 'main output' to stdout makes sense to me, but this (and
> possibly respecting -z) sounds like a different mode of operation, not
> just a matter of formats. It seems different enough that I'd prefer not
> to piggyback on "fetch.output" for this (even though this adds more
> surface to the interface...).
> 
> We could add --porcelain and say that "fetch.output" is ignored if
> --porcelain is also given. That also eliminates the need for
> --output-format, I think.

I was thinking about this initially, as well. But ultimately I decided
against this especially because of your second paragraph: we'd now need
to think about precedence of options and mutual exclusion, and that to
me feels like an interface that is less obvious than a single knob that
works as you'd expect.

> The .c changes look good to me.
> 
> > +test_expect_success 'fetch porcelain output with HEAD and --dry-run' '
> > +	test_when_finished "rm -rf head" &&
> > +	git clone . head &&
> > +	COMMIT_ID=$(git rev-parse HEAD) &&
> > +
> > +	git -C head fetch --output-format=porcelain --dry-run origin HEAD >actual &&
> > +	cat >expect <<-EOF &&
> > +	* $ZERO_OID $COMMIT_ID FETCH_HEAD
> > +	EOF
> > +	test_cmp expect actual &&
> > +
> > +	git -C head fetch --output-format=porcelain --dry-run origin HEAD:foo >actual &&
> > +	cat >expect <<-EOF &&
> > +	* $ZERO_OID $COMMIT_ID refs/heads/foo
> > +	EOF
> > +	test_cmp expect actual
> > +'
> 
> As mentioned upthread, I think this test isn't needed because
> "porcelain" wouldn't run into the bug we are checking for anyway.

The only reason that the other bug was able to survive for so long was
that we didn't have test coverage there. So I think it makes sense to
explicitly test this, too, also because it causes us to walk a different
code path.

Last but not least: this test uncovered a segfault I had in a previous
version. So I'd rather keep it :)

Patrick