On Sun, Apr 24, 2016 at 7:11 AM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
> On Sat, Apr 23, 2016 at 9:30 AM, Prasanna Kalever <pkalever@xxxxxxxxxx> wrote:
>> Hi all,
>>
>> Noticed our regression machines are reporting back really slow,
>> especially CentOs and Smoke
>>
>> I found that most of the slaves are marked offline, this could be the
>> biggest reasons ?
>>
>>
>
> Regression machines are scheduled to be offline if there are no active
> jobs. I wonder if the slowness is related to LVM or related factors as
> detailed in a recent thread?
>

Sorry, the previous mail was sent incomplete (blame some Gmail shortcut).

Hi Vijay,

Honestly, I was not aware that the machines move to the offline state by
themselves; I only knew that they go idle. Thanks for sharing that
information. But we still need to reclaim most of the machines. Here is
the reason each of them is offline.

CentOS slaves: only 2/14 slaves are online [1]

slave20.cloud.gluster.org (online)
slave21.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave22.cloud.gluster.org (online)
slave23.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave24.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave25.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave26.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave27.cloud.gluster.org [Offline Reason: Disconnected by rastar : rastar taking this down for pranith. Needed for debugging with tar issue. Apr 20, 2016 3:44:14 AM]
slave28.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave29.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave32.cloud.gluster.org [Offline Reason: idle]
slave33.cloud.gluster.org [Offline Reason: idle]
slave34.cloud.gluster.org [Offline Reason: idle]
slave46.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]

Smoke slaves: only 2/15 slaves are online [2]

slave20.cloud.gluster.org (online)
slave21.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave22.cloud.gluster.org (online)
slave23.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave24.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave25.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave26.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave27.cloud.gluster.org [Offline Reason: Disconnected by rastar : rastar taking this down for pranith. Needed for debugging with tar issue. Apr 20, 2016 3:44:14 AM]
slave28.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave29.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave32.cloud.gluster.org [Offline Reason: idle]
slave33.cloud.gluster.org [Offline Reason: idle]
slave34.cloud.gluster.org [Offline Reason: idle]
slave46.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
slave47.cloud.gluster.org [Offline Reason: idle]

NetBSD slaves: only 6/11 are online [3]

nbslave71.cloud.gluster.org (online)
nbslave72.cloud.gluster.org [Offline Reason: This node is offline because Jenkins failed to launch the slave agent on it.]
nbslave74.cloud.gluster.org [Offline Reason: Disconnected by kaushal Mar 21, 2016 10:59:43 PM]
nbslave75.cloud.gluster.org (online)
nbslave77.cloud.gluster.org (online)
nbslave79.cloud.gluster.org (online)
nbslave7c.cloud.gluster.org (online)
nbslave7g.cloud.gluster.org [Offline Reason: Disconnected by rastar : anoop is using this to debug netbsd related issue Mar 29, 2016 2:27:20 AM]
nbslave7h.cloud.gluster.org [Offline Reason: Disconnected by kaushal Apr 13, 2016 3:15:06 AM]
nbslave7i.cloud.gluster.org [Offline Reason: Disconnected by jdarcy : Consistently generating spurious failures due to ping timeouts. This costs people *hours* for a platform nobody uses except as a test for perfused. Feb 27, 2016 9:09:09 PM]
nbslave7j.cloud.gluster.org (online)

Summary:

For CentOS regressions: 9/14 slaves were completely down (not just idle)
For Smoke: 9/15 slaves were completely down
For NetBSD regressions: 5/11 slaves were completely down

IIRC, the CentOS regression and Smoke jobs share the same machines, so
9 (CR+S) + 5 (NR) = 14 slaves were down.

In total (CentOS [+ Smoke] + NetBSD), 14/26 machines were down, and not
just because they were idle.

[1] https://build.gluster.org/label/rackspace_regression_2gb/
[2] https://build.gluster.org/label/smoke_tests/
[3] https://build.gluster.org/label/netbsd7_regression/

Thanks,
--
Prasanna

> -Vijay

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel