Usman Akinyemi <usmanakinyemi202@xxxxxxxxx> writes: > As some issues that can happen with a Git client can be operating system > specific, it can be useful for a server to know which OS a client is > using. In the same way it can be useful for a client to know which OS > a server is using. Hmph. The other end may be running different version of Git, and the version difference of _our_ software is probably more relevant. For that matter, they may even be running something entirely different from our software, like Gerrit. So I am not sure I am convinced that os-version thing is a good thing to have with that paragraph. > Let's introduce a new protocol (`os-version`) allowing Git clients and > servers to exchange operating system information. The protocol is > controlled by the new `transfer.advertiseOSVersion` config option. The last sentence is redundant and can safely removed. The next paragraph describes it better than "is controlled by". > Add the `transfer.advertiseOSVersion` config option to address > privacy concerns. It defaults to `true` and can be changed to > `false`. When enabled, this option makes clients and servers send each > other the OS name (e.g., "Linux" or "Windows"). The information is > retrieved using the 'sysname' field of the `uname(2)` system call. Add "or its equivalent" at the end. macOS may have one, but it probably is not quite correct to say that Windows have uname system call (otherwise we wouldn't be emulating it on top of GetVersion ourselves). > However, there are differences between `uname(1)` (command-line utility) > and `uname(2)` (system call) outputs on Windows. These discrepancies > complicate testing on Windows platforms. For example: > - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\ > .2024-02-14.20:17.UTC.x86_64 > - `uname(2)` output: Windows.10.0.20348 > > On Windows, uname(2) is not actually system-supplied but is instead > already faked up by Git itself. We could have overcome the test issue > on Windows by implementing a new `uname` subcommand in `test-tool` > using uname(2), but except uname(2), which would be tested against > itself, there would be nothing platform specific, so it's just simpler > to disable the tests on Windows. OK. > +transfer.advertiseOSVersion:: > + When `true`, the `os-version` capability is advertised by clients and > + servers. It makes clients and servers send to each other a string > + representing the operating system name, like "Linux" or "Windows". > + This string is retrieved from the `sysname` field of the struct returned > + by the uname(2) system call. Defaults to true. Presumably, both ends of the connection independently choose whether they enable or disable this variable, so we have 2x2=4 combinations (here, versions of Git before the os-version capability support is introduced behave the same way as an installation with this configuration variable set to false). And among these four combinations, only one of them results in "send to each other", but the description above is fuzzy. > diff --git a/connect.c b/connect.c > index 10fad43e98..6d5792b63c 100644 > --- a/connect.c > +++ b/connect.c > @@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader) > if (server_supports_v2("agent")) > packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized()); > > + if (server_supports_v2("os-version") && advertise_os_version(the_repository)) > + packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized()); Not a new problem, because the new code is pretty-much a straight copy from the existing "agent" code, but do we ever use unsanitized versions of git-user-agent and os-version? If not, I am wondering if we should sanitize immediately when we obtain the raw string and keep it, get rid of _santized() function from the public API, and make anybody calling git_user_agent() and os_version() to get sanitized safe-to-use strings. I see http.c throws git_user_agent() without doing any sanitization at the cURL library, but it may be a mistake that we may want to fix (outside the scope of this topic). Since the contrast between the os_version() vs the os_version_sanitized() is *new* in this series, however, we probably would want to get it right from the beginning. So the question is again, do we ever need to use os_version() that is a raw string that may require sanitizing? I do not think of any offhand.