Re: [PATCH 0/4] rev-list: introduce NUL-delimited output mode

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 10 Mar 2025 13:37:48 -0700

Justin Tobler <jltobler@xxxxxxxxx> writes:

>         ?<oid> [SP <token>=<value>]... LF
>
> where values containing LF or SP are printed in a token specific fashion
> so that the resulting encoded value does not contain either of these two
> problematic bytes. For example, missing object paths are quoted in the C
> style so they contain LF or SP.

"so" -> "when"???

> To make machine parsing easier, this series introduces a NUL-delimited
> output mode for git-rev-list(1) via a `-z` option following a suggestion
> from Junio in a previous thread[1]. In this mode, instead of LF, each
> object is delimited with two NUL bytes and any object metadata is
> separated with a single NUL byte. Examples:
>
>         <oid> NUL NUL
>         <oid> [NUL <path>] NUL NUL

Why do we need double-NUL in the above two cases?

>         ?<oid> [NUL <token>=<value>]... NUL NUL

This one I understand.  We could do without double-NUL here as well,
though, by taking the lack of "=" in the token that follows a NUL as
the sign that the previous record has ended.  That avoids double-NUL
while keeping the format extensible.

As this topic is essentially designing a new, machine-parseable
format, we could even unify all three formats into one.  For example,
the format could be like this:

	<oid> NUL [<attr>=<value> NUL]...

where

 (1) A record ends when a new record begins.

 (2) The beginning of a new record is signaled by <oid> that is all
     hexadecimal and does not have any '=' in it.

 (3) The traditional "rev-list --objects" output that gives path in
     addition to the object name uses "path" as the <attr> name,
     i.e. such a record looks like "<oid> NUL path=<path> NUL".

 (4) The traditional "rev-list --missing" output loses the leading
     "?"; it is replaced by "missing" as the <attr> name, i.e. such
     a record may look like "<oid> NUL missing=yes NUL..." together
     with other "<token>=<value> NUL" pairs appended as needed at
     the end.
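
To illustrate that a reader needs no lookahead and no double-NUL to
find record boundaries under rules (1)-(4), here is a rough sketch in
Python.  Nothing emits this format today; the function name
parse_records and the sample attribute names are made up for
illustration, and values are kept as bytes on the assumption that
paths need not be valid UTF-8.

```python
import re

# Assumption: an object name is a non-empty run of lowercase hex digits.
OID_RE = re.compile(rb"[0-9a-f]+")


def parse_records(data: bytes):
    """Yield (oid, attrs) from a '<oid> NUL [<attr>=<value> NUL]...' stream."""
    fields = data.split(b"\0")
    # Every field is NUL-terminated, so split() leaves one empty tail item.
    if fields and fields[-1] == b"":
        fields.pop()

    oid, attrs = None, {}
    for field in fields:
        if b"=" in field:
            # Rule (3)/(4): "<attr>=<value>" belongs to the current record.
            if oid is None:
                raise ValueError("attribute seen before any object name")
            name, value = field.split(b"=", 1)  # value may itself contain "="
            attrs[name.decode("ascii")] = value
        elif OID_RE.fullmatch(field):
            # Rules (1)/(2): an all-hex field without "=" starts a new
            # record, which implicitly ends the previous one.
            if oid is not None:
                yield oid, attrs
            oid, attrs = field.decode("ascii"), {}
        else:
            raise ValueError(f"malformed field: {field!r}")

    if oid is not None:
        yield oid, attrs  # flush the last record at end of input


if __name__ == "__main__":
    sample = (
        b"1234abcd\0path=dir/a file\0"   # "rev-list --objects" style
        b"deadbeef\0"                    # bare object name
        b"cafe0123\0missing=yes\0path=x=y\0"
    )
    for oid, attrs in parse_records(sample):
        print(oid, attrs)
```

The only state the reader keeps is the record being accumulated, and
a path that contains SP, LF or "=" needs no quoting because only the
first "=" in a field is significant.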

Hmm?