On January 17, 2025 5:22 PM, Junio C Hamano wrote: >Usman Akinyemi <usmanakinyemi202@xxxxxxxxx> writes: > >> As some issues that can happen with a Git client can be operating >> system specific, it can be useful for a server to know which OS a >> client is using. In the same way it can be useful for a client to know >> which OS a server is using. > >Hmph. The other end may be running different version of Git, and the version >difference of _our_ software is probably more relevant. >For that matter, they may even be running something entirely different from our >software, like Gerrit. So I am not sure I am convinced that os-version thing is a good >thing to have with that paragraph. > >> Let's introduce a new protocol (`os-version`) allowing Git clients and >> servers to exchange operating system information. The protocol is >> controlled by the new `transfer.advertiseOSVersion` config option. > >The last sentence is redundant and can safely removed. The next paragraph >describes it better than "is controlled by". > >> Add the `transfer.advertiseOSVersion` config option to address privacy >> concerns. It defaults to `true` and can be changed to `false`. When >> enabled, this option makes clients and servers send each other the OS >> name (e.g., "Linux" or "Windows"). The information is retrieved using >> the 'sysname' field of the `uname(2)` system call. > >Add "or its equivalent" at the end. >macOS may have one, but it probably is not quite correct to say that Windows have >uname system call (otherwise we wouldn't be emulating it on top of GetVersion >ourselves). > >> However, there are differences between `uname(1)` (command-line >> utility) and `uname(2)` (system call) outputs on Windows. These >> discrepancies complicate testing on Windows platforms. For example: >> - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\ >> .2024-02-14.20:17.UTC.x86_64 >> - `uname(2)` output: Windows.10.0.20348 >> >> On Windows, uname(2) is not actually system-supplied but is instead >> already faked up by Git itself. We could have overcome the test issue >> on Windows by implementing a new `uname` subcommand in `test-tool` >> using uname(2), but except uname(2), which would be tested against >> itself, there would be nothing platform specific, so it's just simpler >> to disable the tests on Windows. > >OK. > >> +transfer.advertiseOSVersion:: >> + When `true`, the `os-version` capability is advertised by clients and >> + servers. It makes clients and servers send to each other a string >> + representing the operating system name, like "Linux" or "Windows". >> + This string is retrieved from the `sysname` field of the struct returned >> + by the uname(2) system call. Defaults to true. > >Presumably, both ends of the connection independently choose whether they >enable or disable this variable, so we have 2x2=4 combinations (here, versions of >Git before the os-version capability support is introduced behave the same way as >an installation with this configuration variable set to false). > >And among these four combinations, only one of them results in "send to each >other", but the description above is fuzzy. > >> diff --git a/connect.c b/connect.c >> index 10fad43e98..6d5792b63c 100644 >> --- a/connect.c >> +++ b/connect.c >> @@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct >packet_reader *reader) >> if (server_supports_v2("agent")) >> packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized()); >> >> + if (server_supports_v2("os-version") && >advertise_os_version(the_repository)) >> + packet_write_fmt(fd_out, "os-version=%s", >os_version_sanitized()); > >Not a new problem, because the new code is pretty-much a straight copy from the >existing "agent" code, but do we ever use unsanitized versions of git-user-agent and >os-version? If not, I am wondering if we should sanitize immediately when we >obtain the raw string and keep it, get rid of _santized() function from the public API, >and make anybody calling git_user_agent() and os_version() to get sanitized safe- >to-use strings. > >I see http.c throws git_user_agent() without doing any sanitization at the cURL >library, but it may be a mistake that we may want to fix (outside the scope of this >topic). Since the contrast between the >os_version() vs the os_version_sanitized() is *new* in this series, however, we >probably would want to get it right from the beginning. > >So the question is again, do we ever need to use os_version() that is a raw string >that may require sanitizing? I do not think of any offhand. uname(2) is definitely not portable. uname(1) is almost always available, but there is no guarantee about uname(2). I am not entirely happy having my builds break if having to write one between rc0 and rc1 when this rolls. How is this being handled? os_version() is also not portable. What if we had something that asked for specific elements of the string, by name or id.