Re: [Gluster-infra] Downtime for Jenkins

On 05/17/2015 02:32 PM, Vijay Bellur wrote:
[Adding gluster-devel]

On 05/16/2015 11:31 PM, Niels de Vos wrote:
On Sat, May 16, 2015 at 06:32:00PM +0200, Niels de Vos wrote:
It seems that many failures of the regression tests (at least for
NetBSD) are caused by failing to reconnect to the slave. Jenkins tries
to keep a control connection open to the slaves, and reconnects when the
connection terminates.

I do not know why the connection is disrupted, but I can see that
Jenkins is not able to resolve the hostname of the slave. For example,
from (well, you have to find the older logs, Jenkins seems to have
automatically reconnected)
http://build.gluster.org/computer/nbslave72.cloud.gluster.org-v2/log :

     java.io.IOException: There was a problem while connecting to nbslave71.cloud.gluster.org:22
     ...
     Caused by: java.net.UnknownHostException: nbslave71.cloud.gluster.org: Name or service not known
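
To check from the Jenkins master whether name resolution for a slave is flaky, something like the following could be used. This is only a minimal sketch: the function name `resolve_with_retries`, the retry count, and the delay are assumptions for illustration and are not taken from the Jenkins configuration.

```python
import socket
import time


def resolve_with_retries(hostname, port=22, attempts=3, delay=1.0,
                         resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve hostname, retrying when the lookup fails.

    Returns a sorted list of unique addresses. Raises the last
    socket.gaierror if every attempt fails. The resolver and sleep
    arguments exist only so the function can be exercised without DNS.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(hostname, port)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise last_error


if __name__ == "__main__":
    # Hostname taken from the log above; the result depends on local DNS.
    try:
        print(resolve_with_retries("nbslave71.cloud.gluster.org"))
    except socket.gaierror as exc:
        print("resolution failed:", exc)
```

If the lookup fails intermittently here as well, the problem is in name resolution on the master and not in the slave itself.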


The error in the console log of the regression test is less helpful; it
only reports the disconnection:


http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/5408/console


In fact, this looks very much related to these reports:

- https://issues.jenkins-ci.org/browse/JENKINS-19619 (duplicate of JENKINS-18879)
- https://issues.jenkins-ci.org/browse/JENKINS-18879

This problem should be fixed in Jenkins 1.524 and newer. Time to upgrade
Jenkins too?

Yes, I have started an upgrade. Please expect a downtime for Jenkins
during the upgrade.

I will update once the activity is complete.


The upgrade to Jenkins v1.613 is now complete, and Jenkins seems to be largely doing fine. Several Jenkins plugins have also been updated to their latest versions. During the upgrade, I noticed that we were using the deprecated 'gerrit approve' interface to report the status of a smoke run. I have changed that to use 'gerrit review', and this seems to have addressed the problem of smoke tests not reporting status back to gerrit.
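
For reference, a rough sketch of how a job could build a 'gerrit review' call over ssh. The host, user, and helper name `gerrit_review_command` are placeholders and not the values used on build.gluster.org, and the `--verified` option assumes the Gerrit instance has a Verified label configured.

```python
import shlex


def gerrit_review_command(commit, verified, message,
                          host="review.example.org", port=29418,
                          user="jenkins"):
    """Build the argument list for an ssh call to 'gerrit review'.

    host, port and user are placeholder defaults (assumptions).
    The remote command is passed as one string, so the message is
    quoted for the remote side.
    """
    remote = "gerrit review --verified {} --message {} {}".format(
        verified, shlex.quote(message), shlex.quote(commit))
    return ["ssh", "-p", str(port), "{}@{}".format(user, host), remote]


if __name__ == "__main__":
    # Print the command only; pass the list to subprocess.run() to execute it.
    print(gerrit_review_command("abc123", "+1", "smoke test passed"))
```

The function only assembles the command, so it can be inspected before anything is sent to gerrit.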

There were a few instances where Jenkins was not able to launch slaves through ssh but later succeeded upon automatic retries. We will need to watch this behavior to see whether the problem persists and gets in the way of normal functioning.

Manu - can you please verify and report back if the NetBSD slaves work better with the upgraded Jenkins master?

All - please drop a note on gluster-infra if you happen to notice problems with Jenkins.

Thanks,
Vijay






