On Tue, Apr 14, 2020 at 9:28 PM Dan Rue <dan.rue@xxxxxxxxxx> wrote:
>
> On Tue, Apr 14, 2020 at 01:12:50PM +0200, Dmitry Vyukov wrote:
> > On Tue, Apr 14, 2020 at 12:06 AM Qian Cai <cai@xxxxxx> wrote:
> > > Well, there are other CI's beyond syzbot.
> > > On the other hand, this makes me worry who is testing on linux-next every day.
> >
> > How do these use-after-free's and locking bugs get past the
> > unit-testing systems (which syzbot is not) and remain unnoticed for so
> > long?...
> > syzbot uses the dumbest VMs (GCE), so everything it triggers during
> > boot should be triggerable pretty much everywhere.
> > It seems to be an action point for the testing systems. "Boot to ssh"
> > is not the best criteria. Again if there is a LOCKDEP error, we are
> > not catching any more LOCKDEP errors during subsequent testing. If
> > there is a use-after-free, that's a serious error on its own and KASAN
> > produces only 1 error by default as well. And as far as I understand,
> > lots of kernel testing systems don't even enable KASAN, which is very
> > wrong.
> > I've talked to +Dan Rue re this few days ago. Hopefully LKFT will
> > start catching these as part of unit testing. Which should help with
> > syzbot testing as well.
>
> LKFT has recently added testing with KASAN enabled and improved the
> kernel log parsing to catch more of this class of errors while
> performing our regular functional testing.
>
> Incidentally, -next was also broken for us from March 25 through April 5
> due to a perf build failure[0], which eventually made itself all the way
> down into the v5.6 release and, I believe, the first two 5.6.x stable
> releases.
>
> For -next, LKFT's gap is primarily reporting. We do build and run over
> 30k tests on every -next daily release, but we send out issues manually
> when we see them because triaging is still a manual effort. We're
> working to build better automated reporting.
> If anyone is interested in
> watching LKFT's -next results more closely (warning, it's a bit noisy),
> please let me know. Watching the results at https://lkft.linaro.org
> provides some overall health indications, but again, it gets pretty
> difficult to separate signal from noise once you start drilling down
> without sufficient context of the system.

What kind of failures and noise do you get? Is it flaky tests?
I would assume build failures are ~0% flaky/noisy, and boot failures
are maybe ~1% flaky/noisy due to infrastructure issues.

I can't find any actual test failure logs in the UI. I've got to this page:
https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v5.7-rc1-24-g8632e9b5645b/testrun/1363280/suite/kselftest/tests/
which seems to contain failed tests on mainline, but I still can't find
the actual test failure logs.

> Dan
>
> [0] https://lore.kernel.org/stable/CA+G9fYsZjmf34pQT1DeLN_DDwvxCWEkbzBfF0q2VERHb25dfZQ@xxxxxxxxxxxxxx/
>
> --
> Linaro LKFT
> https://lkft.linaro.org
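
[Editor's aside: the single-shot behavior discussed above is configurable. As a
rough sketch, a CI debug kernel addressing Dmitry's points might carry a config
fragment along these lines; option and parameter names are from mainline
Kconfig/Documentation and should be verified against the tree under test.]

```
# Illustrative debug-kernel config fragment for CI boot/unit testing
CONFIG_KASAN=y               # detect use-after-free and out-of-bounds accesses
CONFIG_KASAN_GENERIC=y       # generic compiler-instrumented KASAN mode
CONFIG_PROVE_LOCKING=y       # lockdep: prove locking correctness, report violations
CONFIG_DEBUG_ATOMIC_SLEEP=y  # catch sleeping in atomic context

# Kernel command line: KASAN reports only the first bug by default, so one
# early report masks everything after it; kasan_multi_shot keeps reporting.
#   kasan_multi_shot
# Note that lockdep still disables itself after its first report, so a clean
# boot with no lockdep splat remains a prerequisite for meaningful coverage.
```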