On Thu 11-01-18 23:11:12, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Wed 10-01-18 22:37:52, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Wed 10-01-18 20:49:56, Tetsuo Handa wrote:
> > > > > Tetsuo Handa wrote:
> > > > > > I can hit this bug with Linux 4.11 and 4.8. (i.e. at least all 4.8+ have this bug.)
> > > > > > So far I haven't hit this bug with Linux 4.8-rc3 and 4.7.
> > > > > > Does anyone know what is happening?
> > > > >
> > > > > I simplified the reproducer and succeeded to reproduce this bug with both
> > > > > i7-2630QM (8 core) and i5-4440S (4 core). Thus, I think that this bug is
> > > > > not architecture specific.
> > > >
> > > > Can you see the same with 64b kernel?
> > >
> > > No. I can hit this bug with only x86_32 kernels.
> > > But if the cause is not specific to 32b, this might be silent memory corruption.
> > >
> > > > It smells like a ref count imbalance and premature page free to me. Can
> > > > you try to bisect this?
> > >
> > > Too difficult to bisect, but at least I can hit this bug with 4.8+ kernels.
> > The bug in 4.8 kernel might be different from the bug in 4.15-rc7 kernel.
> 4.15-rc7 kernel hits the bug so trivially.

Maybe you want to disable the oom reaper to reduce chances of some issue
there.
-- 
Michal Hocko
SUSE Labs