Re: Monitoring a repository for changes


 



On Wed, Jun 21 2017, Eric Wong jotted:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>> On Wed, Jun 21 2017, Tim Hutt jotted:
>>
>> > Hi,
>> >
>> > Currently if you want to monitor a repository for changes there are
>> > three options:
>> >
>> > * Polling - run a script to check for updates every 60 seconds.
>> > * Server side hooks
>> > * Web hooks (on Github, Bitbucket etc.)
>> >
>> > Unfortunately for many (most?) cases server-side hooks and web hooks
>> > are not suitable. They require you to both have admin access to the
>> > repo and have a public server available to push updates to. That is a
>> > huge faff when all I want to do is run some local code when a repo is
>> > updated (e.g. play a sound).
>
> Yeah, it kinda sucks that way.
>
> Currently, for one of my public-inbox mirrors which has ssh
> access to the primary server on public-inbox.org, I have:
>
> 	#!/bin/sh
> 	while true
> 	do
> 		# GNU tail(1) uses inotify to avoid polling on Linux
> 		ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs | \
> 				while read sha1 ref
> 		do
> 			for GIT_DIR in git-vger.git
> 			do
> 				export GIT_DIR
> 				git fetch || continue
> 				git update-server-info
> 				public-inbox-index # update Xapian index
> 			done
> 		done
> 	done
>
> It's not perfect as it requires multiple processes on the
> server, but it's better than polling for my limited use.
>
>> > Currently people resort to polling
>> > (https://stackoverflow.com/a/5199111/265521) which is just ugly. I
>> > would like to propose that there should be a fourth option that uses a
>> > persistent connection to monitor the repo. It would be used something
>> > like this:
>> >
>> >     git watch https://github.com/git/git.git
>> >
>> > or
>> >
>> >     git watch git@xxxxxxxxxx:git/git.git
>> >
>> > It would then print simple messages to stdout. The complexity of what
>> > it prints is up for debate; it could be something as simple as
>> > "PUSH\n", or it could include more information, e.g. JSON-encoded
>> > information about the commits. I'd be happy with just "PUSH\n" though.
>>
>> Insofar as this could be implemented in some standard way in Git it's
>> likely to have a large overlap with the "protocol v2" that keeps coming
>> up here on-list. You might want to search for past threads discussing
>> that.
>
> Yeah, it hasn't been a priority for me, either...
>
>> > In terms of implementation, the HTTP transport could use Server-Sent
>> > Events, and the SSH transport can pretty much do whatever so that
>> > should be easy.
>>
>> In case you didn't know, any of the non-trivially sized git hosting
>> providers (e.g. github, gitlab) provide you access over ssh, but you
>> can't just run any arbitrary command, it's a tiny set of whitelisted
>> commands. See the "git-shell" manual page (github doesn't use that exact
>> software, but something similar).
>>
>> But overall, it would be nice to have some rationale for this approach
>> other than that you think polling is ugly. There are a lot of advantages
>> to polling for something you don't need near-instantly, e.g. imagine how
>> many active connections a site like GitHub would need to handle if
>> something like this became widely used, that's in a lot of ways harder
>> to scale and load balance than just having clients that poll something
>> that's trivially cached as static content.
>
> Polling becomes more expensive with TLS and high-latency
> connections, and also increases power consumption if done
> frequently for redundancy purposes.
>
> I've long wanted to do something better to allow others to keep
> public-inbox mirrors up-to-date.  Having only 64-128 bytes of
> overhead per userspace per-connection should be totally doable
> based on my experience working on cmogstored; at which point
> port exhaustion will become the limiting factor (or TLS overhead
> for HTTPS).

Come to think of it, I should probably have asked you about this. I
have a one-liner running that polls every 5 minutes, but skips the pull
if I haven't changed my git.git in a day:

    while true; do if test $(find ~/g/git -type f -mmin -1440 | wc -l) -gt 0; then git pull; else echo too old; fi ; date ; sleep 300; done
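For readability, here is roughly the same loop unrolled into two shell
functions. This is only a sketch: the names recently_changed and
poll_repo are made up for illustration, and it assumes the pull should
happen in the same directory that find inspects (the one-liner just
pulls in whatever the current directory is).

```shell
#!/bin/sh
# Sketch of the polling one-liner, unrolled. The function names are
# illustrative only; they are not part of git.

# Succeed if any regular file under directory $1 was modified within the
# last $2 minutes. -mmin is supported by GNU and BSD find, but is not POSIX.
recently_changed() {
	test -n "$(find "$1" -type f -mmin "-$2" 2>/dev/null | head -n 1)"
}

# Every $3 seconds: pull in repository $1 if it saw local changes within
# the last $2 minutes, otherwise just report that it is too old.
# Assumption: the pull targets $1 (via "git -C") rather than the cwd.
poll_repo() {
	while true
	do
		if recently_changed "$1" "$2"
		then
			git -C "$1" pull
		else
			echo too old
		fi
		date
		sleep "$3"
	done
}
```

Calling "poll_repo ~/g/git 1440 300" should then behave like the
one-liner when that was started from inside ~/g/git.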

> But perhaps a cheaper option might be the traditional email/IRC
> notification and having a client-side process watch for that
> before fetching.

If there was an IRC channel with this info I could/would use that.
Getting it via E-Mail would just get me into the same problem
public-inbox is currently solving for me, i.e. I might as well keep the
git ML up-to-date on that machine if I'm going to otherwise need to
subscribe to a "hey there's a new message on the git ML" list :)
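For the plain polling case discussed earlier in the thread, a somewhat
lighter variant is to ask the remote only for the tip of one ref with
"git ls-remote" and run a command when it changes, so no objects are
fetched until something actually moved. It is still polling, with a new
connection per check. A minimal sketch; remote_tip and watch_ref are
hypothetical helper names, not git commands:

```shell
#!/bin/sh
# Sketch: poll a remote ref with "git ls-remote" and react to changes.
# remote_tip and watch_ref are hypothetical helpers, not part of git.

# Print the object ID that ref $2 currently points at in repository $1.
# Prints nothing if the ref does not exist or the remote is unreachable.
remote_tip() {
	git ls-remote "$1" "$2" | cut -f 1 | head -n 1
}

# watch_ref <repo> <ref> <seconds> <command...>
# Check the ref every <seconds> and run <command...> each time its tip
# differs from the last one seen. Failed lookups are silently skipped.
watch_ref() {
	url=$1 ref=$2 interval=$3
	shift 3
	last=$(remote_tip "$url" "$ref")
	while sleep "$interval"
	do
		tip=$(remote_tip "$url" "$ref")
		if test -n "$tip" && test "$tip" != "$last"
		then
			last=$tip
			"$@"
		fi
	done
}
```

Something like
"watch_ref https://github.com/git/git.git refs/heads/master 60 echo PUSH"
would then print "PUSH" after each update it notices, which is about as
much as the proposed "git watch" output. The ref name and the 60 second
interval are just example values.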


