Re: Looking for help to understand external filter driver code

Jeff King <peff@xxxxxxxx> · Wed, 20 Jul 2016 07:49:16 -0600

On Tue, Jul 19, 2016 at 02:33:09PM -0700, Junio C Hamano wrote:

> > Git writes --> 4 byte content length
> > Git writes --> content string
> > Git reads <-- 4 byte filtered content length
> > Git reads <-- filtered content
> 
> Do you really need to force the sender to know the length in
> advance?  Together with the sequential nature of the above exchange,
> i.e. the filter is forbidden from producing even a single byte of
> its output before reading everything Git feeds it, you are making it
> impossible to use filters that perform streaming conversion.

Another option: use pkt-lines with a flush packet to indicate
end-of-input. That allows arbitrary sized data, with streaming, and
reuses existing concepts from git. There is proportional overhead, but
it's only 4 bytes per 64k, which is under 0.01%.
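
To make the framing concrete, here is a rough sketch in Python of what
the pkt-line option could look like. It assumes the usual pkt-line
layout (a 4-digit hex length that counts its own 4 bytes, a 65520-byte
packet limit, and "0000" as the flush packet). The function names are
made up for illustration and are not part of any proposed filter
protocol.

```python
import io

MAX_PKT_LEN = 65520              # largest pkt-line, including the 4-byte header
MAX_PAYLOAD = MAX_PKT_LEN - 4    # most data one packet can carry
FLUSH_PKT = b"0000"              # zero-length packet marking end-of-input


def write_pkt_stream(out, chunks):
    """Write each chunk as one or more pkt-lines, then a flush packet.

    The sender never needs to know the total size in advance.
    """
    for chunk in chunks:
        for start in range(0, len(chunk), MAX_PAYLOAD):
            piece = chunk[start:start + MAX_PAYLOAD]
            # The hex length covers the header itself plus the payload.
            out.write(b"%04x" % (len(piece) + 4))
            out.write(piece)
    out.write(FLUSH_PKT)


def read_pkt_stream(inp):
    """Yield payloads one packet at a time until the flush packet."""
    while True:
        header = inp.read(4)
        if len(header) != 4:
            raise EOFError("truncated pkt-line header")
        length = int(header.decode("ascii"), 16)
        if length == 0:
            return               # flush packet: end of this stream
        if length < 4:
            raise ValueError("invalid pkt-line length %d" % length)
        payload = inp.read(length - 4)
        if len(payload) != length - 4:
            raise EOFError("truncated pkt-line payload")
        yield payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_pkt_stream(buf, [b"hello", b"x" * 70000])
    buf.seek(0)
    total = sum(len(p) for p in read_pkt_stream(buf))
    print(total)
```

Because the reader hands back each packet as it arrives, a filter built
this way can start producing output before it has seen all of its input.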

It does make some implementations easier if they know the size ahead of
time, though, so if we are _sure_ that nobody will want streaming later,
pkt-lines may not be a good tradeoff. If we do print a size ahead of time, the
"normal" thing in git would be to do so in base-10 ascii followed by a
newline (e.g., as found in "cat-file --batch", or fast-import's "data"
command).
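
For comparison, a size-first exchange in that style might look roughly
like the Python sketch below. The exact framing (a bare decimal count,
a newline, then the raw bytes) and the helper names are assumptions for
illustration only; "cat-file --batch" and fast-import each wrap the
count in their own surrounding syntax.

```python
import io


def write_sized(out, data):
    """Write the payload length in base-10 ASCII, a newline, then the payload."""
    out.write(b"%d\n" % len(data))
    out.write(data)


def read_sized(inp):
    """Read one length-prefixed payload written by write_sized()."""
    line = inp.readline()
    if not line.endswith(b"\n"):
        raise EOFError("missing newline after size")
    digits = line[:-1]
    if not digits.isdigit():
        raise ValueError("size is not a base-10 integer: %r" % digits)
    size = int(digits.decode("ascii"))
    data = inp.read(size)
    if len(data) != size:
        raise EOFError("payload shorter than announced size")
    return data


if __name__ == "__main__":
    buf = io.BytesIO()
    write_sized(buf, b"some content\n")
    buf.seek(0)
    print(read_sized(buf))
```

The reader can allocate exactly the right buffer up front, but the
writer has to hold the whole payload (or at least know its length)
before sending the first byte.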

-Peff