On Thu, Oct 17, 2019 at 3:25 PM <Tim.Bird@xxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Theodore Y. Ts'o on October 17, 2019 2:09 AM
> >
> > On Wed, Oct 16, 2019 at 05:26:29PM -0600, Shuah Khan wrote:
> > >
> > > I don't really buy the argument that unit tests should be
> > > deterministic. Possibly, but I would opt for having the ability
> > > to feed test data.
> >
> > I strongly believe that unit tests should be deterministic.
> > Non-deterministic tests are essentially fuzz tests. And fuzz tests
> > should be different from unit tests.
>
> I'm not sure I have the entire context here, but I think deterministic
> might not be the right word, or it might not capture the exact meaning
> intended.
>
> I think there are multiple issues here:
>
> 1. Does the test enclose all its data, including working data and
> expected results? Or, does the test allow someone to provide working
> data? This alternative implies that either some of the testcases or
> the results might be different depending on the data that is provided.
> IMHO the test would be deterministic if it always produced the same
> results based on the same data inputs, and if the input data was
> deterministic. I would call this a data-driven test.
>
> Since the results would be dependent on the data provided, the results
> from tests using different data would not be comparable. Essentially,
> changing the input data changes the test, so maybe it's best to
> consider this a different test. Like 'test-with-data-A' and
> 'test-with-data-B'.

That kind of sounds like parameterized tests[1]; it was a feature I was
thinking about adding to KUnit (there is a rough sketch of what I mean
further down), but I think the general idea of parameterized tests has
fallen out of favor; I am not sure why. In any case, I have used
parameterized tests before and have found them useful in certain
circumstances.

> 2. Does the test automatically detect some attribute of the system,
> and adjust its operation based on that (does the test probe?) This is
> actually quite common if you include things like when a test requires
> root access to run. Sometimes such tests, when run without root
> privilege, run as many testcases as possible not as root, and skip
> the testcases that require root.
>
> In general, altering the test based on probed data is a form of
> data-driven test, except the data is not provided by the user. Whether
> this is deterministic in the sense of (1) depends on whether the data
> that is probed is deterministic. In the case of requiring root, it
> should not change from run to run (and it should probably be reflected
> in the characterization of the results).
>
> Maybe neither of the above cases falls in the category of unit tests,
> but they are not necessarily fuzzing tests. IMHO a fuzzing test is one
> which randomizes

Kind of sounds remotely similar to Haskell's QuickCheck[2]; it's sort
of a mix of unit testing and fuzz testing. I have used this style of
testing for other projects and it can be pretty useful. I actually have
a little experiment somewhere trying to port the idea to KUnit.
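To give a flavor of that experiment: the idea is to drive a property
check with pseudo-random data from a fixed seed, so the test stays
deterministic and bisectable even though the inputs are "random". This
is only a sketch of what I have in mind, not anything KUnit supports
today; my_format()/my_parse() are made-up stand-ins for whatever code
is under test:

#include <kunit/test.h>
#include <linux/random.h>

/* Made-up stand-ins for the code under test. */
static u64 my_format(u32 x) { return ((u64)x << 16) | 0xabcd; }
static u32 my_parse(u64 f) { return (u32)(f >> 16); }

/*
 * Property: for all x, parsing the formatted form of x yields x again.
 * The inputs are pseudo-random, but the seed is fixed, so every run
 * (and every bisection step) sees exactly the same data.
 */
static void format_parse_roundtrip_test(struct kunit *test)
{
        struct rnd_state rnd;
        int i;

        prandom_seed_state(&rnd, 0x12345678);

        for (i = 0; i < 1000; i++) {
                u32 x = prandom_u32_state(&rnd);

                KUNIT_EXPECT_EQ(test, x, my_parse(my_format(x)));
        }
}

static struct kunit_case roundtrip_test_cases[] = {
        KUNIT_CASE(format_parse_roundtrip_test),
        {}
};

static struct kunit_suite roundtrip_test_suite = {
        .name = "format-parse-roundtrip",
        .test_cases = roundtrip_test_cases,
};
kunit_test_suite(roundtrip_test_suite);

Changing the seed changes the test, which fits nicely with your
'test-with-data-A' vs. 'test-with-data-B' framing above.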
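And since I brought up parameterized tests above, this is roughly the
shape I was imagining, minus any framework support; in other words,
just a plain table-driven test. Again only a sketch, with my_add() as
a trivial stand-in for the code under test:

#include <kunit/test.h>
#include <linux/kernel.h>

static int my_add(int a, int b) { return a + b; }  /* stand-in */

struct add_param {
        int a, b, expected;
};

/* The "data" in a data-driven test: each entry is test-with-data-N. */
static const struct add_param add_params[] = {
        { .a = 1,  .b = 1, .expected = 2 },
        { .a = -1, .b = 1, .expected = 0 },
        { .a = 0,  .b = 0, .expected = 0 },
};

static void add_test(struct kunit *test)
{
        int i;

        for (i = 0; i < ARRAY_SIZE(add_params); i++) {
                const struct add_param *p = &add_params[i];

                KUNIT_EXPECT_EQ(test, p->expected, my_add(p->a, p->b));
        }
}

static struct kunit_case add_test_cases[] = {
        KUNIT_CASE(add_test),
        {}
};

static struct kunit_suite add_test_suite = {
        .name = "add-params",
        .test_cases = add_test_cases,
};
kunit_test_suite(add_test_suite);

Proper framework support would mostly be about reporting each table
entry as its own test case rather than failing the whole loop at once.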
> the data for a data-driven test (hence using non-deterministic data).
> Once the fuzzer has found a bug, and the data and code for a test is
> fixed into a reproducer program, then at that point it should be
> deterministic (modulo what I say about race condition tests below).
>
> > We want unit tests to run quickly. Fuzz tests need to be run for a
> > large number of passes (perhaps hours) in order to be sure that
> > we've hit any possible bad cases. We want to be able to easily
> > bisect fuzz tests --- preferably, automatically. And any kind of
> > flakey test is hell to bisect.
> Agreed.
>
> > It's bad enough when a test is flakey because of the underlying
> > code. But when a test is flakey because the test inputs are
> > non-deterministic, it's even worse.
> I very much agree on this as well.
>
> I'm not sure how one classes a program that seeks to invoke a race
> condition. This can take variable time, so in that sense it is not
> deterministic. But it should produce the same result if the
> probabilities required for the race condition to be hit are
> fulfilled. Probably (see what I did there :-), one needs to take a
> probabilistic approach to reproducing and bisecting such bugs. The
> duration or iterations required to reproduce the bug (to some
> confidence level) may need to be included with the reproducer
> program. I'm not sure if the syzkaller reproducers do this or not, or
> if they just run forever. One I looked at ran forever. But you would
> want to limit this in order to produce results with some confidence
> level (and not waste testing resources).
>
> ---
> The reason I want to get clarity on the issue of data-driven tests is
> that I think data-driven tests and tests that probe are very much
> desirable. They allow a test to be more generalized, and allow it to
> be specialized for more scenarios without re-coding it.
> I'm not sure if this still qualifies as unit testing, but it's very
> useful as a means to extend the value of a test. We haven't trod into
> the mocking parts of KUnit, but I'm hoping that it may be possible to
> have that be data-driven (depending on what's being mocked), to make
> it easier to test more things using the same code.

I imagine it wouldn't be that hard to add that on as a feature of a
parameterized testing implementation.

> Finally, I think the issue of testing speed is orthogonal to whether
> a test is self-enclosed or data-driven. Definitely fuzzers, which are
> experimenting with system interaction in a non-deterministic way,
> have speed problems.

[1] https://dzone.com/articles/junit-parameterized-test
[2] http://hackage.haskell.org/package/QuickCheck

Cheers!
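P.S. Tim, on bounding how long a race reproducer runs: if one
iteration trips the race with probability p, then running it
n >= ln(1 - c) / ln(1 - p) times catches the bug at least once with
confidence c; for example, p = 0.001 and c = 0.999 works out to about
6905 iterations. A reproducer could carry that bound with it instead
of running forever. A back-of-the-envelope sketch (userspace C, all
names made up):

#include <math.h>
#include <stdio.h>

/*
 * Smallest n such that 1 - (1 - p)^n >= c, i.e. the number of
 * iterations needed to see a p-probability event at least once
 * with confidence c.
 */
static long iterations_for_confidence(double p, double c)
{
        return (long)ceil(log(1.0 - c) / log(1.0 - p));
}

int main(void)
{
        long n = iterations_for_confidence(0.001, 0.999);

        printf("run the reproducer %ld times, then give up\n", n);
        return 0;
}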