Re: "make check" should be optional

Gregory Farnum <gfarnum@xxxxxxxxxx> · Tue, 13 Mar 2018 13:38:47 -0700

On Tue, Mar 13, 2018 at 7:38 AM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
> On Tue, Mar 13, 2018 at 10:03 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> On Tue, 13 Mar 2018, Alfredo Deza wrote:
>>> The current "make check" job on pull requests is configured to require
>>> a "passing/OK" state to allow a merge.
>>>
>>> Looking back at the past 100 builds since March 13th, there is roughly a 20%
>>> failure rate [0]. The rate is similar for ceph-volume PRs, which never
>>> hit any make check paths: 6 of the last 25 ceph-volume pull requests
>>> have make check failures.
>>>
>>> These failures in make check mean that we must almost always ignore them and
>>> use administrator privileges to merge. This is far from ideal, and further
>>> reduces confidence in the tests.
>>>
>>> Some of the failures come from tests that sit in a grey area, yet are
>>> enough to produce a non-zero exit status:
>>>
>>>     /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t: failed
>>>     --- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t
>>>     +++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t.err
>>>     @@ -40,6 +40,7 @@
>>>      # it is almost impossible to get the same stats with random and crush
>>>      # if they are, it most probably means something went wrong somewhere
>>>        $ test "$STATS_CRUSH" != "$STATS_RANDOM"
>>>     +  [1]
>>>     # Ran 13 tests, 0 skipped, 1 failed.
>>
>> If this is a nondeterministic test case then we should remove it!
>>
>> The harder cases are the ones that are nondeterministic because of
>> environmental conditions.  I don't think we understand the cause well
>> enough to fix (or skip) them.
>
> This is kind of what I was looking for as well: the possibility of
> starting to prune tests that aren't working well for us, since there
> seems to be a strong interest in just keeping make check around as-is.
>
> I don't know enough about these tests, otherwise I would offer to start
> helping here. In the case of ceph-disk, I think in *master* they could
> be removed from make check entirely, relying instead on ad-hoc ceph-disk
> testing when targeted PRs show up. That would reduce a chunk of the time
> that is spent on setting up the ceph-disk test environment.

I don't think anybody in the project knows much about those tests any
more. I'd recommend just creating bugs for non-deterministic tests
when you run across them, and we can start working our way through them
as a group to make our tests more useful overall.
(Just anecdotally, I see failures from machine disconnects or whatever
a lot more often than issues in "make check". But I don't do enough
with it to run statistics.)
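
Tallying failed builds by cause would settle which kind of failure
dominates. A minimal sketch, assuming each build can be labelled with an
outcome and a cause; the function name and the records below are made up
for illustration and are not pulled from the Jenkins history:

```python
from collections import Counter


def failure_breakdown(builds):
    """Return (overall failure rate, Counter of causes among failed builds).

    `builds` is a list of (passed, cause) tuples; `cause` is None for
    builds that passed. This record shape is an assumption of the sketch.
    """
    failures = [cause for passed, cause in builds if not passed]
    rate = len(failures) / len(builds) if builds else 0.0
    return rate, Counter(failures)


# Hypothetical sample: 25 builds, 6 of them failed.
builds = (
    [(True, None)] * 19
    + [(False, "make check")] * 2
    + [(False, "machine disconnect")] * 4
)

rate, causes = failure_breakdown(builds)
print(f"failure rate: {rate:.0%}")
for cause, count in causes.most_common():
    print(f"{cause}: {count}")
```

With real labels in place of the sample, the per-cause counts would show
whether flaky tests or infrastructure problems deserve attention first.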

As for turning them off for ceph-disk...oh, I think I misunderstood
what you were proposing. Are you saying those are some of the noisy
tests? And as we move to ceph-volume there's little point testing
ceph-disk in master?
-Greg