Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>
> On Thu, May 02 2019, Eric Wong wrote:
>
> > Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> >> IIRC, More than half the bandwidth of Googles git servers are used
> >> for ls-remote calls (i.e. polling a lot of repos, most of them did *not*
> >> change, by build bots which are really eager to try again after a minute).
> >
> > Thinking back at that statement; I think polling can be
> > optimized in git, at least.
> >
> > IIRC, your repos have lots of refs; right?
> > (which is why it's a bandwidth problem)
> >
> > Since info/refs is a static file (hopefully updated by a
> > post-update hook), the smart client can make an HTTP request
> > to check If-Modified-Since: to avoid the big response.
> >
> > The client would need to cache the mtime of the last requested
> > refs file; somewhere.
> >
> > IOW, do refs negotiation the "dumb" way; since it's no better
> > than the smart way, really. Keep doing object transfers the
> > smart way.
> >
> > During the initial clone, smart servers could probably
> > have a header informing clients that their info/refs
> > is up-to-date and clients can do dumb refs negotiation.
>
> Doing this with If-Modified-Since sounds like an easier drop-in
> replacement (just needs a client change), but I wonder if ETag isn't a
> better fit for this.

ETags overall could work.

> I.e. we'd document some convention where the ETag is a hash of the refs
> the client expects to be advertised in some format, it then sends that
> to the server.

But I was hoping to avoid the overhead of spawning
git-http-backend entirely. And there's no consistent way to
configure ETags on different static servers.

> That allows the same thing without anyone keeping more state than they
> keep now in their local ref store

I think caching the remote info/refs is useful anyways in case
the user changes their fetch refspec, and it could speed up
invocations of "git ls-remote".
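
To make the two polling ideas above concrete, here is a rough Python
sketch of the client-side bookkeeping. It is not git code. CachedRefs,
conditional_headers, apply_response and refs_etag are all made-up names,
and the "<oid> <refname>" line format fed to the hash is only an
assumption, since the ETag convention is not specified anywhere yet.
The HTTP transport itself is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Mapping, Optional, Tuple


@dataclass
class CachedRefs:
    """Hypothetical client-side cache entry for a remote's info/refs."""
    body: bytes
    last_modified: Optional[str]  # Last-Modified value exactly as the server sent it


def conditional_headers(cache: Optional[CachedRefs]) -> Dict[str, str]:
    """Request headers for polling info/refs.

    Adds If-Modified-Since only when a cached copy with a timestamp exists.
    """
    if cache is not None and cache.last_modified:
        return {"If-Modified-Since": cache.last_modified}
    return {}


def apply_response(cache: Optional[CachedRefs], status: int,
                   headers: Mapping[str, str],
                   body: bytes) -> Tuple[CachedRefs, bool]:
    """Return (cache to keep, whether the refs changed).

    Assumes `headers` uses the canonical "Last-Modified" spelling.
    """
    if status == 304:
        if cache is None:
            raise ValueError("got 304 Not Modified without a cached copy")
        return cache, False  # nothing changed, the big response was avoided
    if status == 200:
        return CachedRefs(body=body,
                          last_modified=headers.get("Last-Modified")), True
    raise ValueError(f"unexpected HTTP status {status}")


def refs_etag(refs: Mapping[str, str]) -> str:
    """Hypothetical ETag convention: SHA-256 over sorted "<oid> <refname>" lines."""
    h = hashlib.sha256()
    for name in sorted(refs):
        h.update(f"{refs[name]} {name}\n".encode())
    return '"' + h.hexdigest() + '"'
```

With something like this, a build bot would only download info/refs
again when the server answers 200. A 304 costs a few headers instead
of the whole ref list.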

> On the fancier side I think bloom filters are something that's been
> discussed (and I believe someone (Twitter?) had such an internal patch),
> i.e. the client sends a bloom filter of refs they have, and the server
> advertises things they don't know about yet (and due to how bloom
> filters work, some things they *do* know about already but tripped up
> the bloom filter...).

I'm not smart enough to understand such fancy things :)
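
For anyone else following along, a toy Python sketch of the bloom
filter idea might look like the following. This is only an
illustration under my own assumptions, not the internal patch
mentioned above, which I have not seen. BloomFilter and
refs_to_advertise are hypothetical names, and the filter size, hash
count and "<oid> <refname>" key format are arbitrary choices.

```python
import hashlib
from typing import Dict, Iterator, Mapping


class BloomFilter:
    """Tiny bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive each bit position by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def refs_to_advertise(server_refs: Mapping[str, str],
                      client_filter: BloomFilter) -> Dict[str, str]:
    """Server side: keep only refs the client's filter definitely lacks."""
    return {name: oid for name, oid in server_refs.items()
            if f"{oid} {name}" not in client_filter}


# Client side: add every (oid, refname) pair it already has.
client = BloomFilter()
for name, oid in {"refs/heads/master": "aaa", "refs/heads/next": "bbb"}.items():
    client.add(f"{oid} {name}")

server = {"refs/heads/master": "aaa",  # unchanged, client has it
          "refs/heads/next": "ccc",    # moved since the client last fetched
          "refs/heads/topic": "ddd"}   # new on the server
advertised = refs_to_advertise(server, client)
```

In this particular sketch a false positive means the server skips a
ref that the client does not actually have, so a real protocol would
presumably need some fallback for that case.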