Re: [PATCH v5 6/6] cat-file: add remote-object-info to batch-command

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Thu, 11 Aug 2022 12:58:52 +0200

On Mon, Aug 08 2022, Calvin Wan wrote:

>> > Since the `info` command in cat-file --batch-command prints object info
>> > for a given object, it is natural to add another command in cat-file
>> > --batch-command to print object info for a given object from a remote.
>>
>> Is it ?:)
>
> Haha yes this could use a little rewording
>
>> > Add `remote-object-info` to cat-file --batch-command.
>>
>> I realize this bit of implementation changed in v4, i.e. it used to be
>> in "fetch", and I'm happy to have it moved out of there, we don't need
>> to overload it more.
>>
>> But I remember thinking (and perhaps commenting on-list, I can't
>> remember) that the "object-info" server verb was a bit odd at the time
>> that it was implemented. I understand the motivation, but surely it was
>> stumbling its way towards being something more generic, i.e. being able
>> to just expose cmd_cat_file() in some form.
>>
>> Which is one of the goals I've had in mind with working on fixing memory
>> leaks in various places, i.e. once you get common commands to clean up
>> after themselves it usually becomes to have a "command server".
>>
>> So (and I don't mind if this is longer term, just asking), is there a
>> reason for why we wouldn't want to do away with object-info and this
>> "cat-file talks to a remote", in favor of just having support for
>> invoking arbitrary commands from a remote.
>>
>> Of course that set of allowed RCE commands would be zero by default, but
>> if we had some way to define tha "cat-file" was allowed to be called,
>> and only if you invoked:
>>
>>         cat-file --batch="%(objectsize)"
>>
>> Or whatever, but over the v2 protocol, wouldn't we basically have
>> object-info in a more roundabout way?
>
> While I do think that if we did have a set of allowed RCE commands, this
> would be a good candidate to be one of those commands. I am worried
> about security, maintainability, and server performance risks this change
> would also carry. Figuring out which commands are secure and would
> not overload the server, and then maintaining that set seems like a much
> more worrisome design than having a secure git server.

I'm only suggesting that the interface be extendable enough to allow for
that, not that we do it right now, i.e. that the protocol command could
be:

	# With appropriate \0 delimiting etc.
	cmd cat-file -e <SHA1>

As opposed to something like:

	object-exists <SHA1>

But that we would *not* for now expose some mechanism where the server
operator could configure which arbitrary commands to allow, we'd just
intercept that specifically, and dispatch it to the appropriate code in
builtin/cat-file.c (or libify them enough to expose them, and then call
that).

Of course one of the points of doing so would be to eventually expose
the ability to run a larger set of safe commands remotely, if the server
operator agrees, but we'd leave that for the future.

Right now we'd get the benefit of not duplicating various parts of
existing code, and having a plain one-to-one mapping to existing
commands.