On 2024-03-14 at 23:29:40, Junio C Hamano wrote: > "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes: > > >> A radical counter-proposal for the design is to update the client > >> side to do this unconditionally, without needing any new option. > > > > I'm not sure that would be a great idea. > > Thanks. I was looking for a push-back ;-) > > > Since it's a push, that will > > trigger authentication, which may prompt the user (e.g., for a password > > or token or for a YubiKey touch with FIDO2 SSH) and which they might be > > able to easily avoid. > > Do you mean we first go unauthenticated to find out what commits are > at the tip of refs at the destination repository, and then switch to > authenticated push after we find out there is something that is > worth pushing? I somehow thought we need an authenticated access > only for the initial ls-refs exchange. It depends on the server and the protocol. Protocol v0 allows unauthenticated ref exchange, and then the push itself can be authenticated. It is customary to require authentication on both pieces, but not mandatory. > > As a server operator, I also expect that there > > are people doing lots of needless attempts at pushing in automated > > systems (because with enough users, there will be at least some who do > > bizarre or inefficient things), and I would prefer to avoid serving > > those requests if I don't need to. (For example, for us, reference > > updates need to go through a distributed commit protocol to update > > multiple replicas of the repository, and if there's no ref updates, then > > we cut out multiple services which we don't need to contact.) > > Yes, but if you have an extra no-op ref update in the bunch, that > one is excluded from the set of changes to be synchronised across > replicas, no? As I said, we can't do that, because we have to verify that the old value is correct. Even if we did perform that optimization, once we know we have a write, we have to contact every replica (in case the user does indeed send us a pack, which is allowed by the protocol, even if bizarre), whereas with a ref request or a read, we can satisfy that with a single read replica. I'm sure I can satisfy at least one to two orders of magnitude more no-op push attempts with the current optimization than I could if we didn't have it. It might even be to the point of three or more orders of magnitude. (My guess is that reftable will improve this even further, but I haven't yet measured.) I'm sure this is also true for most other implementations; serving reads, like ls-refs, are almost always substantially cheaper than writes and most server implementations are highly optimized for serving large numbers of reads due to CI farms and such. > > Note also that no-op ref updates cannot be simply omitted on the server > > side because we need to verify that the old value for the ref is > > correct, or we need to reject the operation as out of date. > > Yes, but isn't that something the user would rather want to find out > earlier rather than later? Your push without no-op update may say > "ah, we are up to date" when we are behind or worse diverged. If we > do the no-op update more often, we'd have more chance to catch such > a situation where the worldview the end-user's repository has is out > of sync with reality. Well, we were up to date at the time the ls-refs operation completed, or the push wouldn't have the same old and new OIDs. The client has to declare the old ref value that it got from the ls-refs output in order for the operation to succeed, since the server-side Git won't update the references if the old value doesn't match. (This is how the client knows whether a push is a fast-forward or not: it computes it, and if the remote side has an unknown value, then it's not.) Other than updating the push options, there's no functional benefit to making a second request, since the ls-refs output already told us where the branch is. That's why the current optimization is functionally correct. In any event, there are alternate implementations of the Git protocol which did not implement our current optimization and did, for purposes of mirroring, make excessive numbers of no-op requests, and we did have to ask for that to be fixed. I assure you, the no-op behaviour is better for users because it's faster and substantially more likely to succeed (because it is effectively a read-only ls-refs request), whereas a new ref update is less efficient, slower, and substantially more likely to fail (due to functional limitations on write throughput). I also expect there will be poorly written `pre-push` and `pre-receive` hooks which will not like getting a no-op update and not handle it gracefully. I'm pretty sure Git LFS will handle this just fine, but lots of people have various shell scripts that probably will not. -- brian m. carlson (they/them or he/him) Toronto, Ontario, CA
Attachment:
signature.asc
Description: PGP signature