On Sun, Jan 6, 2019 at 2:47 PM Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2019/01/06 22:24, Dmitry Vyukov wrote:
> >> A report at 2019/01/05 10:08 from "no output from test machine (2)"
> >> ( https://syzkaller.appspot.com/text?tag=CrashLog&x=1700726f400000 )
> >> says that there are flood of memory allocation failure messages.
> >> Since continuous memory allocation failure messages itself is not
> >> recognized as a crash, we might be misunderstanding that this problem
> >> is not occurring recently. It will be nice if we can run testcases
> >> which are executed on bpf-next tree.
> >
> > What exactly do you mean by running test cases on bpf-next tree?
> > syzbot tests bpf-next, so it executes lots of test cases on that tree.
> > One can also ask for patch testing on bpf-next tree to test a specific
> > test case.
>
> syzbot ran "some tests" before getting this report, but we can't find from
> this report what the "some tests" are. If we could record all tests executed
> in syzbot environments before getting this report, we could rerun the tests
> (with manually examining where the source of memory consumption is) in local
> environments.

Filed https://github.com/google/syzkaller/issues/917 for this.

> Since syzbot is now using memcg, maybe we can test with sysctl_panic_on_oom == 1.
> Any memory consumption that triggers global OOM killer could be considered as
> a problem (e.g. memory leak or uncontrolled memory allocation).

Interesting idea. This will also alleviate the previous problem, as I
think only a stream of OOMs currently produces 1+MB of output.
+Shakeel, who was interested in catching more memcg-escaping allocations.

To do this we need buy-in from the kernel community to consider this a
bug/something to fix in the kernel. Systematic testing can't work with
gray-area checks that require humans to look at each case, with some
cases left as working-as-intended.
There are also 2 interesting points:
 - testing of kernels without memcg enabled (some kernel users
   obviously do this); it's doable, but currently syzkaller has no
   precedents/infrastructure to consider some output patterns as bugs
   or not depending on kernel features
 - false positives for minimized C reproducers that have memcg code
   stripped off (people complain that reproducers are too
   large/complex otherwise)
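
As a rough illustration of the first point, a checker could classify OOM console lines differently depending on whether memcg is enabled. This is only a sketch, not existing syzkaller code: the function name `classify_oom_line` and the `memcg_enabled` flag are hypothetical, and the two message prefixes are assumed from typical kernel OOM killer output, whose exact wording varies between kernel versions.

```python
# Assumed console prefixes (hypothetical constants; wording differs
# between kernel versions, so a real checker would need a wider match).
MEMCG_OOM_PREFIX = "Memory cgroup out of memory: "
GLOBAL_OOM_PREFIX = "Out of memory: "


def classify_oom_line(line, memcg_enabled):
    """Classify one console line as 'bug', 'expected', or None.

    A memcg OOM is treated as expected: the test stayed inside its limit.
    A global OOM is treated as a bug only when memcg is enabled, because
    it then suggests an allocation that escaped the cgroup limit.
    """
    # The match is case-sensitive, so the memcg message (lowercase
    # "out of memory") never matches the global prefix.
    if MEMCG_OOM_PREFIX in line:
        return "expected"
    if GLOBAL_OOM_PREFIX in line:
        return "bug" if memcg_enabled else "expected"
    return None
```

The second point shows up here as well: a minimized reproducer with its memcg setup stripped would have to be run with `memcg_enabled=False`, otherwise an ordinary global OOM would be flagged as a bug.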