Hi Jacques,

The test harness sounds good to me. It sounds pretty much spot on for what's needed to do integration testing.

Geoff.

On Wed, 8 Jul 2009, Jacques Mattheij wrote:
> hello again,
>
> One of the tools we discussed was a test harness that you could wrap
> around a gluster module or a stack of modules to put them through
> their paces. That way you can test a lot of different combinations
> of modules and their effects on each other, in case one module
> triggers a bug in another one.
>
> The test harness would know exactly what output to expect from
> the stack for a given input at the top, and what return values to
> expect from the lower layers of the stack in the opposite direction.
>
> The combinations could be preselected, or you could automatically
> generate an exhaustive list of combinations. The reason for this
> is that many gluster bugs are reported as 'hi, I have afr
> on top of unify' or some other exotic combination, and it is
> impossible for the devs to try all of these by hand with every
> release.
>
> Another thing we talked about was a simple logging module which
> you can place anywhere in the stack to log the information
> going up and down without actually doing anything, to aid
> debugging. (This is not an automated test tool in itself, just
> a very handy thing to have.)
>
> Jacques
>
> Geoff Kassel wrote:
> > Hi Mickey,
> >
> > Just so that we're all on the same page here: a regression test
> > suite at its most basic just has to include test cases (i.e. a set
> > of inputs) that can trigger a previously known fault in the code if
> > that fault is present. (I.e. it can see if the code has 'regressed'
> > into a condition where a fault is present.)
> >
> > What it's also taken to mean (and typically includes) is a set of
> > test cases covering corner cases and normal modes of operation,
> > expressed as a set of inputs to code paired with a set of expected
> > outputs, which may or may not include error messages.
> >
> > Test cases aimed at particular levels of the code have specific
> > terminology associated with those levels. At the lowest level, the
> > method level, they're called unit tests. At the module/API level:
> > integration tests. At the system/user interface level: system
> > tests, also known as function, functional, or functionality tests.
> >
> > When new functionality is introduced or a bug is patched, the
> > regression test suite (which in the case of unit tests is typically
> > fully automated) is run to check that the expected behaviour occurs
> > and none of the old faults recur.
> >
> > A lot of the tests you've described fall into the category of
> > function tests - and from my background in automated testing, I know
> > we need a bit more than that to get the stability and reliability
> > results we want. (Simply because you cannot reliably test every
> > corner case of a project the size and complexity of GlusterFS from
> > the command line.)
> >
> > Basically, what GlusterFS needs is fairly even coverage of test
> > cases at all the levels I've just mentioned.
> >
> > What I particularly want to see - and what the devs stated nearly a
> > year ago was already in existence - is unit tests, particularly the
> > kind that can be run automatically.
> >
> > This is so that developers (inside the GlusterFS team or otherwise)
> > can hack on a piece of code to fix a bug or implement new
> > functionality, then run the unit tests to see that they (most
> > likely) haven't caused a regression with their new code.
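To make the harness idea above a little more concrete, here is a rough shell sketch of the volfile-generation side, including the pass-through logging (debug/trace) layer Jacques mentions. The volfile syntax and translator names follow the 1.3/2.0-era conventions, but the stack list, paths, and smoke_test are illustrative assumptions; a real harness would substitute known-input/expected-output cases and an exhaustive combination generator.

#!/bin/bash
# Sketch of the combination harness: build a volfile for each candidate
# translator stack, slip the debug/trace logging layer on top, mount it,
# and run a quick check. Illustrative only, not a working harness.

EXPORT_DIR=/tmp/harness-export
MOUNT_POINT=/tmp/harness-mount
mkdir -p "$EXPORT_DIR" "$MOUNT_POINT"

# Layers to stack on top of storage/posix, listed innermost-first.
STACKS=(
    "features/posix-locks"
    "performance/write-behind"
    "features/posix-locks performance/write-behind"
)

smoke_test() {    # stand-in for real known-input/expected-output cases
    echo hello > "$MOUNT_POINT/probe" &&
    [ "$(cat "$MOUNT_POINT/probe")" = hello ] &&
    rm -f "$MOUNT_POINT/probe"
}

for stack in "${STACKS[@]}"; do
    vol=$(mktemp /tmp/harness-XXXXXX)
    {
        printf 'volume posix\n  type storage/posix\n  option directory %s\nend-volume\n' "$EXPORT_DIR"
        child=posix n=0
        for xl in $stack; do
            n=$((n + 1))
            printf 'volume xl%d\n  type %s\n  subvolumes %s\nend-volume\n' "$n" "$xl" "$child"
            child=xl$n
        done
        # the pass-through logging module discussed above
        printf 'volume trace\n  type debug/trace\n  subvolumes %s\nend-volume\n' "$child"
    } > "$vol"
    if ! glusterfs -f "$vol" "$MOUNT_POINT"; then
        echo "MOUNT FAIL: $stack"; continue
    fi
    sleep 1    # give the mount a moment to come up
    smoke_test && echo "PASS: $stack" || echo "FAIL: $stack"
    umount "$MOUNT_POINT"    # may need fusermount -u as non-root
done

Generating volfiles mechanically like this is what would make the 'afr on top of unify' class of bug reports reproducible without hand-built configurations.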
> >
> > (It's somewhat difficult for outsiders to write unit and integration
> > tests, because typically only the original developers have the
> > in-depth knowledge of the expected behaviour of the code at the low
> > level of detail required.)
> >
> > Integration and function tests should perhaps be developed in
> > parallel. Tests like these (I've outlined elsewhere specifically
> > what kind) would quite likely have picked up the data corruption
> > bugs before they made their way into the first 2.0.x releases.
> >
> > (Pretty much anyone familiar with the goals of the project can write
> > function tests, documenting in live code their expectations of how
> > the system should work.)
> >
> > Long-running stability and load tests like you've proposed are also
> > kinds of function tests, but without the narrowly defined inputs and
> > outputs of specific test cases. They're basically the equivalent of
> > mine shaft canaries - they signal the presence of race conditions,
> > memory leaks, design flaws, and other subtle issues, but often
> > without specifics as to what 'killed' the canary. Once the cause is
> > found, though, a new, more specific test case can be added at the
> > appropriate level.
> >
> > (Useful, yes, but mostly as a starting point for more intensive QA
> > efforts.)
> >
> > The POSIX compliance tests you mentioned are more traditional
> > function level tests - but I think the GlusterFS devs have wandered
> > a little away from full POSIX compliance on some points, so these
> > tests may not be 100% relevant.
> >
> > (This is not necessarily a bad thing - the POSIX standard is
> > apparently ambiguous at times, and there is some wider community
> > feeling that improvements to the standard are overdue. And I'm not
> > sure the POSIX standard was ever written with massively scalable,
> > pluggable, distributed file systems in mind, either. :)
> >
> > I hope my extremely long-winded rant here :) has adequately
> > explained what I feel GlusterFS needs to have in a regression
> > testing system.
> >
> > Geoff.
> >
> > On Tue, 7 Jul 2009, Mickey Mazarick wrote:
> >> What kind of requirements does everyone see as necessary for a
> >> regression test system?
> >>
> >> Ultimately the best testing system would use the tracing translator
> >> and be able to run tests and generate traces for any problems that
> >> occur, giving us something very concrete to provide to the
> >> developers. That's a few steps ahead, however; initially we should
> >> start by outlining some must-haves in terms of how a test setup is
> >> run. Obviously we want something we can run for many hours or days
> >> to test long-term stability, and it would be nice if there were
> >> some central way to spin up new clients to test reliability under
> >> load.
> >>
> >> For basic file operation tests I use the script below. A good
> >> starting point is a tool like http://www.ntfs-3g.org/pjd-fstest.html
> >> - I've seen it mentioned before, and it's a good start for testing
> >> anything POSIX. Here's a simple script that will download and build
> >> it if it's missing, then run a test on a given mount point.
> >>
> >> #!/bin/bash
> >> # Run the pjd-fstest POSIX test suite against a GlusterFS mount,
> >> # downloading and building the suite first if it isn't installed.
> >> if [ "$#" -lt 1 ]; then
> >>     echo "usage: $0 gluster_mount"
> >>     exit 65
> >> fi
> >> GLUSTER_MOUNT=$1
> >> INSTALL_DIR="/usr"
> >> if [ ! -d "$INSTALL_DIR/fstest" ]; then
> >>     cd "$INSTALL_DIR"
> >>     wget http://www.ntfs-3g.org/sw/qa/pjd-fstest-20080816.tgz
> >>     tar -xzf pjd-fstest-20080816.tgz
> >>     mv pjd-fstest-20080816 fstest
> >>     cd fstest
> >>     make
> >>     vi tests/conf    # manual step: set the fs type to match the mount
> >> fi
> >> cd "$GLUSTER_MOUNT"
> >> prove -r "$INSTALL_DIR/fstest/"
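As a concrete example of the narrow function tests Geoff describes, a round-trip test of the kind that could have flagged the 2.0.x data corruption might look like the sketch below. The mount point argument, probe sizes, and cache handling are illustrative assumptions, and dropping the page cache requires root.

#!/bin/bash
# Round-trip data integrity check against a mounted volume: write known
# random data through the mount, flush caches, read it back, compare.

MOUNT=${1:?usage: $0 gluster_mount}
SCRATCH=$(mktemp -d)
trap 'rm -rf "$SCRATCH"' EXIT

fail=0
for size in 1 64 1024 16384; do    # file sizes in KiB, small to large
    dd if=/dev/urandom of="$SCRATCH/ref" bs=1k count=$size 2>/dev/null
    cp "$SCRATCH/ref" "$MOUNT/corruption-probe"
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches    # make reads hit glusterfs, not the cache
    fi
    if ! cmp -s "$SCRATCH/ref" "$MOUNT/corruption-probe"; then
        echo "FAIL: ${size}k file differs after round trip"
        fail=1
    fi
    rm -f "$MOUNT/corruption-probe"
done
exit $fail

Each newly diagnosed corruption bug would then get its own specific case along these lines, so the suite accumulates exactly the regression checks Geoff is asking for.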
> >> Jacques Mattheij wrote:
> >>> hello Anand, Geoff & others,
> >>>
> >>> This pretty much parallels my interaction with the team about a
> >>> year ago: lots of really good intentions, but no actual follow-up.
> >>>
> >>> We agreed that an automated test suite was a must, and that a
> >>> whole bunch of other things would have to be done to get
> >>> glusterfs out of the experimental stage and into production
> >>> grade.
> >>>
> >>> It's a real pity, because I still feel that glusterfs is one of
> >>> the major contenders to become *the* cluster file system.
> >>>
> >>> A lot of community goodwill has been lost. I've kept myself
> >>> subscribed to this mailing list because I hoped that at some
> >>> point we'd move past this endless cat and mouse game with
> >>> stability issues, but for some reason that never happened.
> >>>
> >>> Anand, you have a very capable team of developers, and you have
> >>> a once-in-a-lifetime opportunity to make this happen. Please
> >>> take Geoff's comments to heart and get serious about QA and
> >>> community support, because that is the key to any successful
> >>> FOSS project. Fan that fire and you can't go wrong; lose the
> >>> community support and your project might as well be dead.
> >>>
> >>> I realize this may come across as harsh, but it is intended to
> >>> make it painfully obvious that the most staunch supporters
> >>> of glusterfs are getting discouraged, and that is a loss no
> >>> serious project can afford.
> >>>
> >>> Jacques
> >>>
> >>> Geoff Kassel wrote:
> >>>> Hi Anand,
> >>>>
> >>>> If you look back through the list archives, no one other than me
> >>>> replied to the original QA thread where I first posted my
> >>>> patches, nor to the Savannah patch tracker thread where I also
> >>>> posted my patches. (Interesting how those trackers have been
> >>>> disabled now...)
> >>>>
> >>>> It took me pressing the issue after discovering yet another bug
> >>>> before we even started talking about my patches. So yes, my
> >>>> patches were effectively ignored.
> >>>>
> >>>> At the time, you did mention that the code the patches were to be
> >>>> applied against was being reworked, in addition to your comments
> >>>> about my code comments.
> >>>>
> >>>> I explained that the comments were necessary to stop the
> >>>> automated tool from flagging the same potential issues again on
> >>>> reuse of that tool, and that the other comments were for future
> >>>> QA work. There was no follow-up on that from you, nor any
> >>>> suggestion on how I might improve these comments to your
> >>>> standards.
> >>>>
> >>>> I continued to supply patches in the Savannah tracker against the
> >>>> latest stable 1.3 branch - which included some refactoring for
> >>>> your reworked code, IIRC - for some time after that discussion.
> >>>> All of my patches were in sync with the code from the publicly
> >>>> available 1.3 branch repository within days of a new TLA
> >>>> patchset.
> >>>>
> >>>> None of these were adopted either.
> >>>>
> >>>> I simply ran out of spare time to maintain this patchset, and I
> >>>> got tired of pressing an issue (QA) that you and the dev team
> >>>> clearly weren't interested in.
> >>>>
> >>>> I don't have the kind of spare time needed to do that sort of
> >>>> in-depth re-audit of your code from scratch (as would be needed)
> >>>> in the manner that I did back then. So I can't meet your request
> >>>> at this time, sorry.
> >>>>
> >>>> As I've suggested elsewhere, now that you apparently have the
> >>>> resources for a stand-alone QA team, this team might want to at
> >>>> least use the tools I used to generate these patches - RATS and
> >>>> FlawFinder.
> >>>>
> >>>> That way you can generate the kind of QA work I was producing,
> >>>> with the kind of comment style you prefer.
> >>>>
> >>>> The only way I can conceive of being able to help now is in
> >>>> patching individual issues. However, with my time constraints I
> >>>> can only feasibly do that if I have regression tests to make sure
> >>>> I'm not inadvertently breaking other functionality.
> >>>>
> >>>> Hence my continued requests for these.
> >>>>
> >>>> Geoff.
> >>>>
> >>>> On Tue, 7 Jul 2009, Anand Avati wrote:
> >>>>>> I've also gone one better than just advice - I've given up
> >>>>>> significant portions of my limited spare time to audit and
> >>>>>> patch a not-insignificant portion of the GlusterFS code, in
> >>>>>> order to deal with the stability issues I and others were
> >>>>>> encountering. My patches were ignored, on the grounds that they
> >>>>>> contained otherwise unobtrusive comments which were quite
> >>>>>> necessary to the audit.
> >>>>>
> >>>>> Geoff, we really appreciate your efforts, both on the front of
> >>>>> your patch submissions and in voicing your opinions freely. We
> >>>>> also acknowledge the positive intentions behind this thread. As
> >>>>> far as your patch submissions are concerned, there is probably a
> >>>>> misunderstanding. Your patches were not ignored. We do value
> >>>>> your efforts. The patches you submitted were, even at the time
> >>>>> of your submission, not applicable to the codebase.
> >>>>>
> >>>>> Patch 1 (glusterfsd.c) - this file was reworked and almost
> >>>>> rewritten from scratch to work as both client and server.
> >>>>>
> >>>>> Patch 2 (glusterfs-fuse/src/glusterfs.c) - this module was
> >>>>> reimplemented as a new translator (since a separate client was
> >>>>> no longer needed).
> >>>>>
> >>>>> Patch 3 (protocol.c) - with the introduction of non-blocking IO
> >>>>> and the binary protocol, nothing of this file remained.
> >>>>>
> >>>>> What I am hoping to convey is that the reason your patches did
> >>>>> not make it into the repository is that they needed significant
> >>>>> reworking to even apply. I did indeed comment on code comments
> >>>>> of the style /* FlawFinder: */, but that was definitely _not_
> >>>>> the reason they weren't included. Please understand that nothing
> >>>>> was ignored intentionally.
> >>>>>
> >>>>> That being said, I can fully appreciate the effort you have been
> >>>>> putting in to maintain patchsets by yourself and keep them up to
> >>>>> date with the repository. I request you to resubmit them (with
> >>>>> git format-patch) against the HEAD of the repository.
> >>>>>
> >>>>> Thanks,
> >>>>> Avati
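Both halves of this exchange are scriptable. Below is a hedged sketch of the workflow: scan the tree with the audit tools Geoff names, then rebase and package local fixes the way Avati asks. The directory, remote, and branch names are assumptions about a local checkout, and the exact tool flags may vary by version (check each tool's help).

#!/bin/bash
# Sketch: run the static-analysis tools over the source tree, then
# prepare patches for resubmission with git format-patch.
# Paths and remote/branch names are illustrative assumptions.

cd glusterfs || exit 1    # assumed local checkout of the repository

# Audit passes with RATS and FlawFinder; the level flags set the
# minimum severity reported (spellings as of 2009-era versions).
flawfinder --minlevel=2 libglusterfs/ xlators/ > flawfinder-report.txt
rats -w 2 libglusterfs/ xlators/ > rats-report.txt

# Bring local fix commits up to date with HEAD and emit one patch file
# per commit, as requested.
git fetch origin
git rebase origin/master
git format-patch origin/master -o resubmit/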
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel@xxxxxxxxxx
> > http://lists.nongnu.org/mailman/listinfo/gluster-devel