On 3/25/22 8:48 PM, Ævar Arnfjörð Bjarmason wrote:
On Fri, Mar 25 2022, Jeff Hostetler wrote:
On 3/25/22 3:02 PM, rsbecker@xxxxxxxxxxxxx wrote:
On March 25, 2022 2:03 PM, Jeff Hostetler wrote:
[...]
[...]
> Wouldn't it be much simpler POC in this case to write "watchman
> backend"? Then we'd both get a Linux backend, and an alternate backend
> for the other platforms to validate their implementation.
>
> Some past references to that:
> https://lore.kernel.org/git/871r8c73ej.fsf@xxxxxxxxxxxxxxxxxxx/ &
> https://lore.kernel.org/git/87h7lgfchm.fsf@xxxxxxxxxxxxxxxxxxx/
Yes, there are several ways for a client command, such as any
caller of read_index/refresh_index, to get FS change data from a
monitoring service.
Let's go through the options here for the sake of conversation:
(option 1): Use the hook-like mechanism that Ben built in 2017
            to talk to an intermediary program, shell script, Perl
            script, etc. That "script" itself then talks to a
            long-running service/daemon, such as Watchman, to get
            the list of changes and relays them back to the client.

    * This "proxy" has to handle protocol format conversions.
    * It may also have to start the service on new repos.
    * It depends upon a third-party service being installed.
    * We are limited to supporting platforms where the third-party
      tool is supported.
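
To make the "protocol format conversion" job of the proxy concrete,
here is a minimal sketch in Python. It is not the actual hook or
Watchman protocol: the JSON field names ("clock", "files") and the
NUL-delimited output layout are assumptions chosen only for
illustration, and relay_changes() is a hypothetical helper.

```python
import json


def relay_changes(service_reply: str) -> bytes:
    """Convert a (hypothetical) JSON reply from a monitoring service
    into a NUL-delimited byte stream for the client.

    Assumed input:  {"clock": "<token>", "files": ["path", ...]}
    Assumed output: <token> NUL <path> NUL <path> NUL ...
    """
    reply = json.loads(service_reply)
    token = reply["clock"]          # opaque "changes since" token
    paths = reply.get("files", [])  # paths that may have changed
    fields = [token, *paths]
    # Every field, including the last one, is terminated by a NUL byte.
    return b"\0".join(f.encode("utf-8") for f in fields) + b"\0"
```

Even in this toy form the proxy has to know both formats, which is
the extra moving part that the bullets above describe.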
(option 2): Replace the hook with builtin client code to talk
            directly to the service and bypass the need for
            the proxy script/executable.

    * Git client code would need client-side IPC to talk to
      an established and running service. (Similar to the client
      side of Simple-IPC but probably not pkt-line based.)
    * Git client code would now need to handle any protocol
      format conversions.
    * Git client code might also have to start the service.
    * And we'd still be dependent on a third-party service being
      installed.
    * And we are still limited to supporting platforms where
      the third-party tool is supported.
    * So far we've been assuming that the third-party tool is
      "Watchman", but technically, you could have other such
      services available.
    * So you may need multiple implementations of option 2,
      one for each third-party tool.
    * I'm not saying that this is hard, but just yet another
      detail that would have to be encoded in the Git source
      to get this "free" feature.
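
As a rough illustration of the "one implementation per third-party
tool" point, the sketch below (Python, for brevity) shows a common
client interface with one adapter behind it. All names here
(MonitorClient, InMemoryClient, refresh) are hypothetical, and the
in-memory adapter only stands in for the real IPC that an adapter
for a given tool would need.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class MonitorClient(ABC):
    """What builtin client code would need from any third-party service."""

    @abstractmethod
    def start_if_needed(self, worktree: str) -> None:
        """Make sure the service is watching this worktree."""

    @abstractmethod
    def changes_since(self, worktree: str, token: str) -> Tuple[str, List[str]]:
        """Return (new_token, changed_paths) relative to the given token."""


class InMemoryClient(MonitorClient):
    """Stand-in adapter; a real one would speak the tool's own protocol."""

    def __init__(self) -> None:
        self.running = False
        self.journal: List[str] = []  # changed paths, in arrival order

    def start_if_needed(self, worktree: str) -> None:
        self.running = True

    def changes_since(self, worktree: str, token: str) -> Tuple[str, List[str]]:
        start = int(token) if token else 0  # empty token means "everything"
        return str(len(self.journal)), self.journal[start:]


def refresh(client: MonitorClient, worktree: str, token: str) -> Tuple[str, List[str]]:
    """Client-side flow: ensure the service runs, then ask for changes."""
    client.start_if_needed(worktree)
    return client.changes_since(worktree, token)
```

Each additional tool would mean another MonitorClient subclass with
its own startup and format-conversion logic.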
(option 3): Git implements a daemon to monitor the file system
            directly.

    * Git owns the protocol between client and service.
    * Git owns the backend, so no third-party tools required.
    * Git owns service startup.
    * Unfortunately, we are also responsible for building the
      backends on each platform we want to support.
    * In the future, we could augment the service to be more
      "Git-aware", such as discarding data for ignored files,
      but that is just speculation at this point.
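
Purely to illustrate what "discarding data for ignored files" could
mean, here is a small Python sketch. It is an assumption-laden toy:
it uses simple shell-style globs via fnmatch rather than Git's real
ignore rules, and drop_ignored() is a hypothetical name, not
anything in the daemon.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def drop_ignored(paths: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Keep only the paths that match none of the glob patterns.

    Note: fnmatch-style "*" also matches "/", which differs from
    real .gitignore semantics; this is illustration only.
    """
    patterns = list(patterns)
    return [
        p for p in paths
        if not any(fnmatchcase(p, pat) for pat in patterns)
    ]
```

For example, drop_ignored(["src/main.c", "src/main.o"], ["*.o"])
would keep only "src/main.c", so the service would have less data
to hold and report.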
Now, with that context in place:
[1] Nothing prohibits us from having all three options be available
on a platform. They should all be able to coexist.
[2] One of my stated goals was to reduce the dependency on
third-party tools -- especially on platforms that don't have
a simple package management system. The point here was to
make it easier for enterprises to deploy Git to thousands or
tens of thousands of users (and possibly unattended build
machines) and make use of the feature without *also* having to
deploy and track updates to yet-another third-party tool or
otherwise complicate their ES deployment setups. Only option 3
gets rid of the third-party tool requirement.
[3] Option 2 is a valuable suggestion, don't get me wrong. It
can/will/should improve performance over option 1 by eliminating
an extra process creation and the overhead of pumping all of that
data through another socket-pair/process, along with all of the
context switches that requires.
[4] Option 2 and option 3 could/should perform relatively equally.
And if we wanted to deprecate the hook-like interface, doing
an option 2 implementation would allow us to transition the
platforms for which I don't currently have a backend.
[5] However, option 2 does not eliminate the need for a third-party
tool, so it is of limited interest to me at this time. Yes, it
would be nice to have it for testing and perf testing purposes
and comparisons with option 3, but if I have to budget my time,
I would rather spend my efforts on additional backends.
I consider the question of doing option 2 and a Linux backend
as two completely independent topics -- topics that we can
discuss and/or pursue in parallel if there is interest.
[6] Randall's question was about doing option 3 and I hope that I
provided helpful information should he or anyone else want to
pick up that effort before I can.
[7] If you want to start a parallel conversation on option 2, let's
do that in a new top-level email thread.
Cheers,
Jeff