From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
To: "Emmanuel Dreyfus" <manu@xxxxxxxxxx>, "Ravishankar N" <ravishankar@xxxxxxxxxx>
Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>, "gluster-infra" <gluster-infra@xxxxxxxxxxx>
Sent: Friday, January 8, 2016 11:45:20 AM
Subject: Re: NetBSD tests not running to completion.

On 01/07/2016 02:39 PM, Emmanuel Dreyfus wrote:
> On Wed, Jan 06, 2016 at 05:49:04PM +0530, Ravishankar N wrote:
>> I re-triggered NetBSD regressions for http://review.gluster.org/#/c/13041/3
>> but they are being run in silent mode and are not completing. Can someone
>> from the infra-team take a look? The last 22 tests in
>> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/ have
>> failed. Highly unlikely that something is wrong with all those patches.
> I note your latest test completed with an error in mount-nfs-auth.t:
> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/13260/consoleFull
>
> Would you have the jenkins build that did not complete so that I can have a
> look at it?
>
> Generally speaking, I have to point out that NetBSD regression does shed light
> on generic bugs; we had a recent example with quota-nfs.t. For now there
> are no other well-supported platforms, but if you want glusterfs to
> be really portable, removing mandatory NetBSD regression is not a good idea:
> portability bugs will crop up.
>
> Even a daily or weekly regression run seems a bad idea to me. If you do not
> prevent integration of patches that break NetBSD regression, they will get
> in, and tests will break one by one over time. I have first-hand
> experience of this situation, from when I was trying to catch up with
> NetBSD regression. Many times I reached something reliable enough to become
> mandatory, and it got broken by a new patch before it actually became mandatory.
>
> IMO, relaxing NetBSD regression requirement means the project drops the goal
> of being portable.
>
Hi Emmanuel,
This Sunday I have some time that I can spend helping to make the
tests better on NetBSD. Just recently I have seen bugs that were caught only
by the NetBSD regression, so I see value in making NetBSD more reliable.
Please let me know what we can work on. It would help if
you gave me something specific to glusterfs, to make it more valuable in
the short term. Over time I would like to learn enough to share the load
with you, however little that may be (please bear with me, I sometimes go
quiet). Here are the initial things I would like to know to begin with:

Please count me in too!
-Krutika

1) How to set up NetBSD VMs on my laptop that are exactly the same version as
the ones running on the build systems.
2) How to prevent NetBSD machines from hanging when things crash (at least I
used to see the machines hang when fuse crashed; I am not sure if
this is still the case). This failure needs manual intervention at the
moment on NetBSD regressions; if we make it report the failure and pick up
the next job, that would be the best way forward.
3) We should come up with a list of known problems and how to
troubleshoot them when things are not going smoothly on NetBSD.
Again, we really need to make things automatic; this should be the last
resort. Our top goal should be to make NetBSD machines report failures
and go on to execute the next job.
4) How can we make debugging better in NetBSD? In the worst case we can
make all tests execute in trace/debug mode on NetBSD.

I really want to appreciate the fine job you have done so far in making
sure glusterfs is stable on NetBSD.

Infra team,
I think we need to make some improvements to our infra. We need
to get information about the health of the Linux and NetBSD regression builds.
1) Something like: in the last 100 builds, how many builds succeeded on
Linux and how many succeeded on NetBSD.
2) Which tests failed in the last 100 builds, and how many
times, on both Linux and NetBSD. (I actually wrote parts of this,
but the whole command output has since changed, making my scripts stale.)
3) Which components have the highest number of spurious failures.
4) How many builds did not complete, were manually aborted, etc.
Any other ideas you guys have?

Once we start measuring these things, the next step is to put a process
in place to get the health of the project stable and keep it that way.

Please let me know if anyone wants to volunteer to make things better in
this infra part. Most of the code will be in python.

Pranith
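
A minimal sketch of how points 1, 2 and 4 of the infra list above could be computed, assuming the build results have already been collected into plain records, newest first. The record fields ("platform", "result", "failed_tests") and the function name summarize_builds are hypothetical and only for illustration; this is not an existing script, and fetching or parsing the Jenkins output is deliberately left out. Point 3 would additionally need a mapping from tests to components, which is not attempted here.

```python
from collections import Counter


def summarize_builds(builds, last_n=100):
    """Summarize the most recent builds per platform.

    builds: list of dicts, newest first, with the hypothetical keys
      'platform'     -- e.g. 'linux' or 'netbsd'
      'result'       -- assumed to be 'SUCCESS', 'FAILURE' or 'ABORTED'
      'failed_tests' -- list of test names that failed in that build
    """
    summary = {}
    for platform in sorted({b["platform"] for b in builds}):
        # Keep only the last_n most recent builds for this platform.
        recent = [b for b in builds if b["platform"] == platform][:last_n]
        results = Counter(b["result"] for b in recent)
        failed = Counter(t for b in recent for t in b.get("failed_tests", []))
        summary[platform] = {
            "total": len(recent),
            "succeeded": results["SUCCESS"],
            "not_completed": results["ABORTED"],
            # (test name, failure count) pairs, most frequent first.
            "failed_tests": failed.most_common(),
        }
    return summary


if __name__ == "__main__":
    # Made-up sample records, only to show the expected input shape.
    sample = [
        {"platform": "linux", "result": "SUCCESS", "failed_tests": []},
        {"platform": "netbsd", "result": "FAILURE",
         "failed_tests": ["mount-nfs-auth.t"]},
        {"platform": "netbsd", "result": "ABORTED", "failed_tests": []},
    ]
    for platform, stats in summarize_builds(sample).items():
        print(platform, stats)
```

Keeping the summary separate from whatever collects the build records means that a change in the console output format would only affect the collection step, not the counting.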
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel