On Tue, Jul 07, 2015 at 06:04:44PM +0200, Niels de Vos wrote:
> On Tue, Jul 07, 2015 at 07:13:53PM +0530, Kaushal M wrote:
> > I've taken this slave and one other offline and am rebooting it.
>
> Reminder that you do not need to take the system offline for rebooting.
> I normally follow these steps to get hung systems back functional:
>
> 1. verify stuck job, NFS unmount related?
> 2. open http://build.gluster.org/view/Infra/job/reboot-vm/build
> 3. log in on Jenkins
> 4. start the reboot-vm job for the stuck system
> 5. wait until the job has finished
> 6. click the "abort" [x] link on the stuck job
> 7. retrigger the job after aborting has been done (reload page)
>
> These hangs do not seem to happen on tests from the master branch
> anymore, only on release-3.7. I think this is a confirmation that the
> reference counting for auth-cache structures in gluster/nfs is a working
> solution.
>
> We should backport these changes:
>
> - nfs: add a gf_lock_t for the auth_cache->cache_dict
>   http://review.gluster.org/11021
>
> - core: add "gf_ref_t" for common refcounting structures
>   http://review.gluster.org/11022
>   (already done through http://review.gluster.org/11421)
>
> - nfs: refcount each auth_cache_entry and related data_t
>   http://review.gluster.org/11023
>
> - refcount: correct the documentation
>   http://review.gluster.org/11328
>
>
> I'll try to send backports later this week (maybe Thursday?), unless
> someone else beats me to it. Please reply to this thread if you file a
> bug for this and send some backports.

The above backports have been posted. These should prevent the Gluster/NFS
crashes in the regression tests, and therefore prevent NetBSD from hanging
when it unmounts NFS (after the NFS server has died).

Please check these patches, and merge them when ready:

    http://review.gluster.org/#/q/status:open+project:glusterfs+branch:release-3.7+topic:bug-1242515

Thanks,
Niels

>
> Thanks,
> Niels
>
> >
> > On Tue, Jul 7, 2015 at 6:44 PM, Kotresh Hiremath Ravishankar
> > <khiremat@xxxxxxxxxx> wrote:
> > > Hi Emmanuel,
> > >
> > > We are seeing these issues again on nbslave7h.cloud.gluster.org
> > > http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7974/console
> > >
> > > Thanks and Regards,
> > > Kotresh H R
> > >
> > > ----- Original Message -----
> > >> From: "Emmanuel Dreyfus" <manu@xxxxxxxxxx>
> > >> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > >> Sent: Sunday, July 5, 2015 12:52:23 AM
> > >> Subject: Re: NetBSD regression tests not Initializing...
> > >>
> > >> Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
> > >>
> > >> > Any help is appreciated.
> > >>
> > >> nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
> > >> retriggered your change, but it ran on another machine.
> > >>
> > >> --
> > >> Emmanuel Dreyfus
> > >> http://hcpnet.free.fr/pubz
> > >> manu@xxxxxxxxxx
> > >>
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel