On Tue, Jul 07, 2015 at 07:13:53PM +0530, Kaushal M wrote:
> I've taken this slave and one other offline and am rebooting it.

A reminder that you do not need to take the system offline to reboot it. I
normally follow these steps to get hung systems functional again:

1. verify the stuck job (is it related to an NFS unmount?)
2. open http://build.gluster.org/view/Infra/job/reboot-vm/build
3. log in on Jenkins
4. start the reboot-vm job for the stuck system
5. wait until the job has finished
6. click the "abort" [x] link on the stuck job
7. retrigger the job once the abort has completed (reload the page)

These hangs no longer seem to happen on tests from the master branch, only
on release-3.7. I think this confirms that the reference counting for
auth-cache structures in gluster/nfs is a working solution. We should
backport these changes:

- nfs: add a gf_lock_t for the auth_cache->cache_dict
  http://review.gluster.org/11021
- core: add "gf_ref_t" for common refcounting structures
  http://review.gluster.org/11022
  (already done through http://review.gluster.org/11421)
- nfs: refcount each auth_cache_entry and related data_t
  http://review.gluster.org/11023
- refcount: correct the documentation
  http://review.gluster.org/11328

I'll try to send backports later this week (maybe Thursday?), unless someone
else beats me to it. Please reply to this thread if you file a bug for this
and send some backports.
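
For anyone not familiar with the approach, here is a rough sketch of why refcounting a cache entry helps when a lookup races with a purge. It is only an illustration under my own assumptions, not the code from the reviews above: every name in it (ref_t, entry_t, ref_get, ref_put, demo_lookup_vs_purge) is hypothetical and is not the gf_ref_t API. It also assumes that finding the entry pointer in the cache happens under the cache lock, which the sketch does not show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcount helper; NOT the GlusterFS gf_ref_t API. */
typedef struct ref {
    atomic_uint cnt;               /* number of holders */
    void (*release)(void *data);   /* called when cnt drops to 0 */
    void *data;                    /* object that owns this ref */
} ref_t;

static void ref_init(ref_t *r, void (*release)(void *), void *data)
{
    atomic_init(&r->cnt, 1);       /* the creator holds the first ref */
    r->release = release;
    r->data = data;
}

/* Take a reference. Returns 1 on success, 0 if the count already hit 0. */
static int ref_get(ref_t *r)
{
    unsigned old = atomic_load(&r->cnt);
    while (old != 0) {
        /* on failure, old is reloaded with the current value */
        if (atomic_compare_exchange_weak(&r->cnt, &old, old + 1))
            return 1;
    }
    return 0;
}

/* Drop a reference. Returns 1 if this call released the object. */
static int ref_put(ref_t *r)
{
    if (atomic_fetch_sub(&r->cnt, 1) == 1) {
        r->release(r->data);
        return 1;
    }
    return 0;
}

/* Hypothetical cache entry, standing in for an auth-cache entry. */
typedef struct entry {
    ref_t ref;
    int   perms;
} entry_t;

static int released_count = 0;

static void entry_release(void *data)
{
    released_count++;
    free(data);
}

static entry_t *entry_new(int perms)
{
    entry_t *e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    e->perms = perms;
    ref_init(&e->ref, entry_release, e);
    return e;
}

/* A lookup takes its own ref, so a purge cannot free the entry under it. */
static int demo_lookup_vs_purge(void)
{
    entry_t *e = entry_new(7);     /* the cache holds one ref */
    assert(e != NULL);

    int ok = ref_get(&e->ref);     /* lookup: second ref */
    assert(ok);

    ref_put(&e->ref);              /* purge drops the cache's ref; not freed */
    assert(e->perms == 7);         /* the lookup can still read the entry */

    ref_put(&e->ref);              /* lookup is done; the entry is freed now */
    return released_count;
}
```

Calling demo_lookup_vs_purge() from a main() walks through one lookup and one purge; the entry is released exactly once, after the last holder drops its reference.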

Thanks,
Niels

>
> On Tue, Jul 7, 2015 at 6:44 PM, Kotresh Hiremath Ravishankar
> <khiremat@xxxxxxxxxx> wrote:
> > Hi Emmanuel,
> >
> > We are seeing these issues again on nbslave7h.cloud.gluster.org
> > http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7974/console
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > ----- Original Message -----
> >> From: "Emmanuel Dreyfus" <manu@xxxxxxxxxx>
> >> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> >> Sent: Sunday, July 5, 2015 12:52:23 AM
> >> Subject: Re: NetBSD regression tests not Initializing...
> >>
> >> Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
> >>
> >> > Any help is appreciated.
> >>
> >> nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
> >> retriggered your change, but it went on another machine.
> >>
> >> --
> >> Emmanuel Dreyfus
> >> http://hcpnet.free.fr/pubz
> >> manu@xxxxxxxxxx
> >>
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel@xxxxxxxxxxx
> > http://www.gluster.org/mailman/listinfo/gluster-devel