Re: "make check" should be optional

Alfredo Deza <adeza@xxxxxxxxxx> · Tue, 13 Mar 2018 17:26:47 -0400

On Tue, Mar 13, 2018 at 4:38 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Tue, Mar 13, 2018 at 7:38 AM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>> On Tue, Mar 13, 2018 at 10:03 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>> On Tue, 13 Mar 2018, Alfredo Deza wrote:
>>>> The current "make check" job on pull requests is configured to require
>>>> a "passing/OK" state to allow a merge.
>>>>
>>>> Looking back at the past 100 builds since March 13th, there is roughly a 20%
>>>> failure rate [0]. ceph-volume PRs, which never hit any make check paths, show
>>>> a similar failure rate: 6 of the last 25 ceph-volume pull requests have
>>>> make check failures.
>>>>
>>>> These failures in make check mean that we must almost always ignore them and
>>>> use administrator privileges to merge. This is far from ideal, and further
>>>> reduces confidence in the tests.
>>>>
>>>> Some of the failures are produced by code in a grey area, yet they are enough
>>>> to cause a non-zero exit status:
>>>>
>>>>     /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t: failed
>>>>     --- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t
>>>>     +++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t.err
>>>>     @@ -40,6 +40,7 @@
>>>>      # it is almost impossible to get the same stats with random and crush
>>>>      # if they are, it most probably means something went wrong somewhere
>>>>        $ test "$STATS_CRUSH" != "$STATS_RANDOM"
>>>>     +  [1]
>>>>     # Ran 13 tests, 0 skipped, 1 failed.
>>>
>>> If this is a nondeterministic test case then we should remove it!
>>>
>>> The harder cases are the ones that are nondeterministic because of
>>> environmental conditions.  I think we don't understand the why well enough
>>> to fix (or skip) them.
>>
>> This is kind of what I was looking for as well: the possibility of
>> starting to prune tests that aren't working well for us, since there
>> seems to be a strong interest in just keeping make check around as-is.
>>
>> I don't know enough about these tests, otherwise I would offer to start
>> helping here. In the case of ceph-disk, I think in *master* they could
>> be removed from make check entirely, and we could rely on ad-hoc
>> ceph-disk testing when targeted PRs show up. That would cut the chunk
>> of time that is spent on setting up the ceph-disk test environment.
>
> I don't think anybody in the project any more knows about those tests
> much

Then what is the value if no one knows about them? What is the purpose
of a test if it fails and it doesn't tell us why?

>. I'd recommend just creating bugs for non-deterministic tests
> when you run across them and we can start working our way through them
> as a group to make our tests more useful overall.

That is kind of my issue here: I don't know. Some of them look
non-deterministic enough to raise the issue.
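
One rough way to tell a non-deterministic test from a consistently broken
one is to re-run it a number of times and look at the spread of exit
statuses. The sketch below is only illustrative: the helper names
classify_codes and classify_command are made up for this example and are
not part of make check or any Ceph tooling.

```python
import subprocess


def classify_codes(codes):
    """Classify a list of exit statuses from repeated runs of one test."""
    if not codes:
        raise ValueError("need at least one run to classify")
    failures = sum(1 for code in codes if code != 0)
    if failures == 0:
        return "stable-pass"
    if failures == len(codes):
        return "stable-fail"
    return "flaky"  # mixed results: a candidate for a tracker bug


def classify_command(cmd, runs=10):
    """Run cmd (a list of arguments) `runs` times and classify the results."""
    codes = [subprocess.run(cmd).returncode for _ in range(runs)]
    return classify_codes(codes)
```

A test that comes back as "flaky" under identical inputs is the kind that
would be worth filing a bug for; one that only fails on certain builders
points at the environment instead.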

> (Just anecdotally, I see failures from machine disconnects or whatever
> a lot more often than issues in "make check". But I don't do enough
> with it to run statistics.)

Sure, like I said, environmental issues are fine; we should retrigger
at will and re-run them.

>
> As for turning them off for ceph-disk...oh, I think I misunderstood
> what you were proposing. Are you saying those are some of the noisy
> tests? And as we move to ceph-volume there's little point testing
> ceph-disk in master?

ceph-disk tests are tied into make check, and in master I don't see a
need for that, as those can be run ad-hoc as needed, not every time on
every pull request.
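
As a sketch of what "only when needed" could look like, a job could decide
whether to run the ceph-disk tests from the list of paths a pull request
touches. Everything here is an assumption for illustration: the function
name needs_ceph_disk_tests and the path prefix are hypothetical, and
nothing like this exists in the current make check job.

```python
# Hypothetical path prefixes; the real layout of the tree may differ.
CEPH_DISK_PREFIXES = ("src/ceph-disk/",)


def needs_ceph_disk_tests(changed_paths, prefixes=CEPH_DISK_PREFIXES):
    """Return True if any changed path falls under a ceph-disk prefix."""
    # str.startswith accepts a tuple of prefixes and matches any of them.
    return any(path.startswith(prefixes) for path in changed_paths)
```

With a check like this, a ceph-volume-only pull request would skip the
ceph-disk environment setup entirely.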

I guess the ceph-disk thing is more of a corollary to the more generic
comment on why I think the check is not robust enough and keeps
failing even when the code under review does not affect it.

> -Greg