On Wed, Jun 21 2017, Eric Wong jotted:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>> On Wed, Jun 21 2017, Tim Hutt jotted:
>>
>> > Hi,
>> >
>> > Currently if you want to monitor a repository for changes there are
>> > three options:
>> >
>> > * Polling - run a script to check for updates every 60 seconds.
>> > * Server side hooks
>> > * Web hooks (on Github, Bitbucket etc.)
>> >
>> > Unfortunately for many (most?) cases server-side hooks and web hooks
>> > are not suitable. They require you to both have admin access to the
>> > repo and have a public server available to push updates to. That is a
>> > huge faff when all I want to do is run some local code when a repo is
>> > updated (e.g. play a sound).
>
> Yeah, it kinda sucks that way.
>
> Currently, for one of my public-inbox mirrors which has ssh
> access to the primary server on public-inbox.org, I have:
>
>   #!/bin/sh
>   while true
>   do
>       # GNU tail(1) uses inotify to avoid polling on Linux
>       ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs | \
>       while read sha1 ref
>       do
>           for GIT_DIR in git-vger.git
>           do
>               export GIT_DIR
>               git fetch || continue
>               git update-server-info
>               public-inbox-index # update Xapian index
>           done
>       done
>   done
>
> It's not perfect as it requires multiple processes on the
> server, but it's better than polling for my limited use.
>
>> > Currently people resort to polling
>> > (https://stackoverflow.com/a/5199111/265521) which is just ugly. I
>> > would like to propose that there should be a fourth option that uses a
>> > persistent connection to monitor the repo. It would be used something
>> > like this:
>> >
>> >     git watch https://github.com/git/git.git
>> >
>> > or
>> >
>> >     git watch git@xxxxxxxxxx:git/git.git
>> >
>> > It would then print simple messages to stdout. The complexity of what
>> > it prints is up for debate - it could be something as simple as
>> > "PUSH\n", or it could include more information, e.g. JSON-encoded
>> > information about the commits.
>> > I'd be happy with just "PUSH\n" though.
>>
>> Insofar as this could be implemented in some standard way in Git it's
>> likely to have a large overlap with the "protocol v2" that keeps coming
>> up here on-list. You might want to search for past threads discussing
>> that.
>
> Yeah, it hasn't been a priority for me, either...
>
>> > In terms of implementation, the HTTP transport could use Server-Sent
>> > Events, and the SSH transport can pretty much do whatever so that
>> > should be easy.
>>
>> In case you didn't know, any of the non-trivially sized git hosting
>> providers (e.g. github, gitlab) provide you access over ssh, but you
>> can't just run any arbitrary command, it's a tiny set of whitelisted
>> commands. See the "git-shell" manual page (github doesn't use that exact
>> software, but something similar).
>>
>> But overall, it would be nice to have some rationale for this approach
>> other than that you think polling is ugly. There's a lot of advantages
>> to polling for something you don't need near-instantly, e.g. imagine how
>> many active connections a site like GitHub would need to handle if
>> something like this became widely used, that's in a lot of ways harder
>> to scale and load balance than just having clients that poll something
>> that's trivially cached as static content.
>
> Polling becomes more expensive with TLS and high-latency
> connections, and also increases power consumption if done
> frequently for redundancy purposes.
>
> I've long wanted to do something better to allow others to keep
> public-inbox mirrors up-to-date. Having only 64-128 bytes of
> overhead per userspace per-connection should be totally doable
> based on my experience working on cmogstored; at which point
> port exhaustion will become the limiting factor (or TLS overhead
> for HTTPS).

Come to think of it I should probably have asked you about this: I have
a one-liner running that polls every 5 minutes, but will stop if I
haven't changed my git.git in a day:

    while true; do if test $(find ~/g/git -type f -mmin -1440 | wc -l) -gt 0; then git pull; else echo too old; fi ; date ; sleep 300; done

> But perhaps a cheaper option might be the traditional email/IRC
> notification and having a client-side process watch for that
> before fetching.

If there was an IRC channel with this info I could/would use that, but
getting it via E-Mail would just get me into the same problem
public-inbox is currently solving for me, i.e. I might as well keep the
git ML up-to-date on that machine if I'm going to otherwise need to
subscribe to a "hey there's a new message on the git ML" list :)
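
For readability, the polling one-liner above could be split into small
functions, roughly as sketched below. The function names are made up for
illustration. The sketch also departs from the one-liner in one respect:
it skips anything under .git, on the assumption that only working-tree
activity should count as "I changed my git.git". Treat that as a guess
at the intent rather than a description of the original behaviour.

```shell
#!/bin/sh
# Sketch only: function names are illustrative assumptions, not part of git.

# Succeed if any regular file under directory $1 (outside .git) was modified
# within the last $2 minutes. -mmin is a GNU/BSD find extension, as used in
# the one-liner; pruning .git is an assumption made for this sketch.
recently_touched() {
    test -n "$(find "$1" -name .git -prune -o -type f -mmin "-$2" -print 2>/dev/null | head -n 1)"
}

# One polling step: pull only if the working copy saw recent local activity,
# otherwise report that it is too old (same message as the one-liner).
poll_once() {
    if recently_touched "$1" "$2"; then
        git -C "$1" pull
    else
        echo "too old"
    fi
}

# Usage (not run here), keeping the 5-minute interval and one-day window:
#   while true; do poll_once ~/g/git 1440; date; sleep 300; done
```

Keeping the check in its own function makes it easy to swap in a
different "is this repo still in use" test without touching the loop.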