On Mon, Feb 24, 2014 at 2:09 PM, scl <scl.gplus@xxxxxxxxx> wrote:
>
> Forwarded with Sam Gleske's permission:
>
> On 24.2.2014 at 3:21 PM Sam Gleske wrote:
>
>> I'm not familiar with the shorthand "sth". I can start running tests
>> against www.gimp.org and begin to start ironing
>> out bugs which would need to be fixed in my program. I've run into bugs
>> here and there running my crawler against large websites. Right now
>> I've crawled and run against Drexel University IRT website using the
>> crawler which is ~70k links. I'll do it for www.gimp.org
>> too and then contribute a Job when I have it ready
>> (and for the other GIMP websites as well).
>
> I don't know whether this could cause technical problems for www.gimp.org.
> The reason why I wrote 'something stable for production use' is that I want
> to avoid just them ;-)
> Thus it would be good to hear our website administrators opinion to that.
>

Both the crawler and the tests that follow it support a --request-delay
option. Requests can be delayed by, say, 0.3 seconds (or a smaller or larger
interval if chosen) so as to mitigate the impact on the website. I won't run
any tests against the GIMP website until I get permission to do so.

Also, my testing suite currently supports saving crawl data in a JSON format.
The advantage of this would be separating the crawl from the actual testing
(if that's desired later on). Currently I mostly use it to debug my test
suite: crawling can take a bit of time, so saving that data allows me to skip
that step.

On Mon, Feb 24, 2014 at 2:09 PM, scl <scl.gplus@xxxxxxxxx> wrote:
>
>> Also, I'd like to note that
>> if you're not using git hooks now would be a good time to use them.
>> Rather than using Jenkins to constantly poll the SCM simply implement a
>> hook to launch a Jenkins job when a certain branch is committed to.
>
> Currently we're polling SCM every few minutes.
> Adding a git hook to the
> repository would surely need involvement of the GNOME administrators.
> Is there a good reason to switch over from polling to hooking?
>

The advantage of hooks over polling is that hooks are on demand, i.e. a job
or process is only launched if there is an actual commit. For the sake of
covering all topics I'll briefly describe polling. Polling simply keeps
checking the state of the repository periodically (e.g. every 5 minutes). If
the state hasn't changed (no commits) then it does nothing. If the state has
changed (i.e. commits have been made) then it executes whatever was defined
for the Job.

Hooks simply take less processing time. On a smaller scale they save energy
because they're not constantly polling. They also lower the request load on
the server hosting the repository being polled. It's not a big deal if a
server is serving a single repository or a few repositories. However, when
you have e.g. 10000 repositories on a server, polling each repository every
few minutes starts to eat up the server's ability to serve. For smaller
servers it's not the end of the world to have polling, because it is easier
to implement. I realize the GIMP project may be only a small part of the
ecosystem being served, but it is something worth considering.

SAM
_______________________________________________
gimp-developer-list mailing list
List address: gimp-developer-list@xxxxxxxxx
List membership: https://mail.gnome.org/mailman/listinfo/gimp-developer-list
List archives: https://mail.gnome.org/archives/gimp-developer-list
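
To put rough numbers on the request-load argument above, here is a
back-of-the-envelope sketch in Python. It is only an illustration under
simple assumptions: every repository is polled at the same fixed interval,
and a hook fires exactly once per commit. The function names are made up for
this example and are not part of Jenkins, git, or my test suite.

```python
def polls_per_hour(interval_minutes: int) -> int:
    """Checks a single polling job makes per hour at a fixed interval."""
    return 60 // interval_minutes


def polling_requests_per_hour(repositories: int, interval_minutes: int) -> int:
    """Total polling requests per hour hitting the repository server.

    Assumes every repository is polled by one job at the same interval.
    The count does not depend on whether anything was committed.
    """
    return repositories * polls_per_hour(interval_minutes)


def hook_triggers_per_hour(commits_per_hour: int) -> int:
    """Jobs launched per hour by a hook.

    Assumes one trigger per commit, so an idle repository causes no work.
    """
    return commits_per_hour


if __name__ == "__main__":
    # The figures from the text: 10000 repositories, polled every 5 minutes.
    print(polling_requests_per_hour(10_000, 5))  # 120000
    # A single repository with no commits in the hour.
    print(polling_requests_per_hour(1, 5))       # 12
    print(hook_triggers_per_hour(0))             # 0
```

The point is just that polling cost scales with the number of repositories
and the polling frequency, while hook cost scales with actual commit
activity.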