On Tue, Mar 13, 2018 at 9:41 AM, John Spray <jspray@xxxxxxxxxx> wrote:
> On Tue, Mar 13, 2018 at 1:23 PM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>> The current "make check" job on pull requests is configured to require
>> a "passing/OK" state to allow a merge.
>>
>> Looking back at the past 100 builds since March 13th, there is roughly
>> a 20% failure rate [0]. This is similar to the failure rate for
>> ceph-volume PRs, which never hit any make check paths (6 out of the
>> last 25 ceph-volume pull requests have make check failures).
>>
>> These failures in make check mean that we must almost always ignore
>> them, and use administrator privilege to merge. This is far from
>> ideal, and further reduces the confidence in the tests.
>
> I think we're mostly re-running them rather than ignoring them?
> That's what I do.
>
> The make check condition is already de facto optional because someone
> with the right permissions can skip it -- this at least gives the
> person merging a hoop to jump through, rather than making it easier to
> just ignore make check failures.  For me, the effort of doing a
> "retest this please" and/or using force merge is the lesser of two
> evils compared with making the make check officially optional.

What is the purpose of a test if we just need to re-run it to make it
pass, because it sometimes fails on conditions that aren't accurate?
Where is the value in tests like these?

Again, I am fine with re-testing on environmental issues, but most of
the failures come from tests that are imprecise.

>
> John
>
>> Some of the failures are produced by tests that check a grey area,
>> which is still enough to produce a non-zero exit status:
>>
>> /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t:
>> failed
>> --- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t
>> +++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t.err
>> @@ -40,6 +40,7 @@
>>    # it is almost impossible to get the same stats with random and crush
>>    # if they are, it most probably means something went wrong somewhere
>>    $ test "$STATS_CRUSH" != "$STATS_RANDOM"
>> +  [1]
>> # Ran 13 tests, 0 skipped, 1 failed.
>>
>> Without a doubt, we will hit environmental build issues, and
>> re-triggering is fine, but then again, why do we have a
>> *mandatory check* that has a high failure rate from incorrect
>> assumptions?
>>
>> We've discussed the possibility of "gating" pull requests, and the
>> make check tool has been suggested for this. I would love to see that
>> happen at some point, but that will take significant effort on make
>> check, or a separate tool that can start with higher-confidence checks.
>>
>>
>> [0] https://jenkins.ceph.com/job/ceph-pull-requests/buildTimeTrend
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html