On Thu, Jun 6, 2019 at 4:54 PM Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> wrote:
>
> On 06.06.2019 17:40, Dmitry Vyukov wrote:
> > On Thu, Jun 6, 2019 at 3:43 PM Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> wrote:
> >>
> >> On 06.06.2019 16:13, J. Bruce Fields wrote:
> >>> On Thu, Jun 06, 2019 at 10:47:43AM +0300, Kirill Tkhai wrote:
> >>>> This may be connected with that shrinker unregistering is forgotten on error path.
> >>>
> >>> I was wondering about that too. Seems like it would be hard to hit
> >>> reproduceably though: one of the later allocations would have to fail,
> >>> then later you'd have to create another namespace and this time have a
> >>> later module's init fail.
> >>
> >> Yes, it's had to bump into this in real life.
> >>
> >> AFAIU, syzbot triggers such the problem by using fault-injections
> >> on allocation places should_failslab()->should_fail(). It's possible
> >> to configure a specific slab, so the allocations will fail with
> >> requested probability.
> >
> > No fault injection was involved in triggering of this bug.
> > Fault injection is clearly visible in console log as "INJECTING
> > FAILURE at this stack track" splats and also for bugs with repros it
> > would be noted in the syzkaller repro as "fault_call": N. So somehow
> > this bug was triggered as is.
> >
> > But overall syzkaller can do better then the old probabilistic
> > injection. The probabilistic injection tend to both under-test what we
> > want to test and also crash some system services. syzkaller uses the
> > new "systematic fault injection" that allows to test specifically each
> > failure site separately in each syscall separately.
>
> Oho! Interesting.

If you are interested: you write N (say, 5) into /proc/thread-self/fail-nth, and it will cause a failure at the N-th (5th) failure site in the next syscall, in this task only. By reading the file back after the syscall you can figure out whether the failure was indeed injected or not (i.e. the syscall had fewer than 5 failure sites).
Then, for each syscall in a test (or only for one syscall of interest), we start by writing "1" into /proc/thread-self/fail-nth and running the test. If the failure was injected, we write "2" and restart the test; if it was injected again, we write "3" and restart the test; and so on, until a run in which the failure wasn't injected, which means all failure sites have been tested. This guarantees systematic testing of each error path with a minimal number of runs. This has an obvious extension to "each pair of failure sites" (to test failures on error paths), but it's not supported atm.
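
For illustration, here is a minimal Python sketch of that loop. It is not syzkaller's code. The names run_with_fault and count_failure_sites are made up for this example, and it assumes a kernel built with fault injection so that /proc/thread-self/fail-nth exists. It also assumes the read-back convention from the kernel's fault-injection documentation: 0 means the fault was injected, a positive value means it was not.

```python
import os

# Assumption: present only on kernels built with fault injection support.
FAIL_NTH = "/proc/thread-self/fail-nth"


def run_with_fault(syscall, nth, path=FAIL_NTH):
    """Arm the nth failure site, run syscall(), return True if a fault fired.

    Hypothetical helper: `syscall` is any zero-argument callable that
    performs the system call under test.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.write(fd, str(nth).encode())
        try:
            syscall()
        except OSError:
            pass  # an injected failure typically surfaces as an errno
        # Assumed convention: "0" = injected, positive = not injected.
        remaining = int(os.pread(fd, 32, 0).decode().strip())
        return remaining == 0
    finally:
        os.write(fd, b"0")  # assumed to disarm any pending fault
        os.close(fd)


def count_failure_sites(run_once, max_sites=10000):
    """Call run_once(1), run_once(2), ... until no fault is injected.

    run_once(n) must arm the n-th failure site, run the test once and
    return True if the fault was injected. Returns how many failure
    sites were exercised.
    """
    for n in range(1, max_sites + 1):
        if not run_once(n):
            return n - 1  # sites 1..n-1 each failed exactly once
    raise RuntimeError("gave up after max_sites runs")
```

On a suitable kernel this could be driven with something like count_failure_sites(lambda n: run_with_fault(os.getcwd, n)). The interpreter may hit extra failure sites of its own between arming and the target syscall, so a small C harness would give more precise results.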