On Sun, Jan 14, 2018 at 3:54 AM, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> This memory corruption bug occurs even on CONFIG_SMP=n CONFIG_PREEMPT_NONE=y
> kernel. This bug highly depends on timing and thus too difficult to bisect.
> This bug seems to exist at least since Linux 4.8 (judging from the traces, though
> the cause might be different). None of debugging configuration gives me a clue.
> So far only CONFIG_HIGHMEM=y CONFIG_DEBUG_PAGEALLOC=y kernel (with RAM enough to
> use HighMem: zone) seems to hit this bug, but it might be just by chance caused
> by timings. Thus, there is no evidence that 64bit kernels are not affected by
> this bug. But I can't narrow down any more. Thus, I call for developers who can
> narrow down / identify where the memory corruption bug is.

Hmm. I guess I'm still hung up on the "it does not look like a valid
'struct page *'" thing.

Can you reproduce this with CONFIG_FLATMEM=y instead of CONFIG_SPARSEMEM?

Because if you can, I think we can easily add a few more pfn and
'struct page' validation debug statements. With SPARSEMEM, it gets
pretty complicated because the whole struct page setup is much more
complex.

Linus
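
To illustrate why validation is simpler with FLATMEM: there, every 'struct page' lives in one contiguous array, so a pfn or a 'struct page *' can be sanity-checked with plain range and alignment tests. The following is only a userspace sketch of that idea, not kernel code and not the debug patch discussed in this thread. Every name in it (model_mem_map, MODEL_PFN_OFFSET, MODEL_NR_PAGES, and the model_* helpers) is a hypothetical stand-in invented for the illustration.

```c
/* Userspace model of FLATMEM-style pfn / struct page checks. NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
};

#define MODEL_PFN_OFFSET 0x100UL /* assumed first pfn covered by the array */
#define MODEL_NR_PAGES   1024UL  /* assumed number of page frames */

/* In a flat memory model, all struct pages sit in one contiguous array. */
static struct page model_mem_map[MODEL_NR_PAGES];

/* A pfn is valid if it falls inside the range the array covers. */
static bool model_pfn_valid(unsigned long pfn)
{
	return pfn >= MODEL_PFN_OFFSET &&
	       pfn - MODEL_PFN_OFFSET < MODEL_NR_PAGES;
}

/*
 * A pointer is a plausible struct page * only if it lies inside the
 * array and sits exactly on an element boundary.
 */
static bool model_page_ptr_valid(const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t base = (uintptr_t)model_mem_map;

	if (p < base || p >= base + sizeof(model_mem_map))
		return false;
	return (p - base) % sizeof(struct page) == 0;
}

/* Convert with a check first, so a bogus pointer trips the assert early. */
static unsigned long model_page_to_pfn(const struct page *page)
{
	assert(model_page_ptr_valid(page));
	return (unsigned long)(page - model_mem_map) + MODEL_PFN_OFFSET;
}

static struct page *model_pfn_to_page(unsigned long pfn)
{
	assert(model_pfn_valid(pfn));
	return &model_mem_map[pfn - MODEL_PFN_OFFSET];
}
```

In this model, a pointer one byte into an element or one element past the end of the array is rejected, and model_page_to_pfn(model_pfn_to_page(pfn)) returns pfn for any valid pfn. With SPARSEMEM the struct pages are spread over per-section arrays, so no single base/size pair like this exists, which is presumably the extra complexity referred to above.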