Hi,

I'm seeing random application crashes (SIGSEGV), and after a few minutes this appears in the logfiles:

[133933.729199] /build/linux-lts-wily-4x6IId/linux-lts-wily-4.2.0/mm/pgtable-generic.c:33: bad pmd ffff880fd06d6200(000000018da009e2)
[133933.763015] BUG: Bad rss-counter state mm:ffff88101705f800 idx:1 val:512
[133933.763039] BUG: non-zero nr_ptes on freeing mm: 1

I'm quite certain that it's not a hardware error. The problem appears regularly on random machines of a 100+ machine cluster of Dell PowerEdge R720 servers with 2x Xeon E5 (NUMA) and 64GB ECC memory. The workload is mostly Hadoop YARN with MapReduce and Spark; the JVM (mostly from the DataNodes) crashes randomly under load with SIGSEGV.

The problem appears with kernels 4.3.0 and 4.2.7 from the Ubuntu Kernel Mainline PPA[1] and with the current 4.2 Ubuntu Wily kernel. All of these kernels already include a related patch[2], but I'm still seeing the problem.

The bug disappears when I disable transparent hugepages and reboot the machines!

Before disabling transparent hugepages completely I ran this config:

    echo always > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

Unfortunately I can't provide any more data at the moment. Maybe I'll be able to compile a kernel with debug options turned on over the holidays. If you have any hints on how I can help to pin this down, please tell me. On IRC, CONFIG_DEBUG_VM was recommended.

regards and thanks
Martin

1: http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=M;O=D
2: https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable.git/+/47aee4d8e314384807e98b67ade07f6da476aa75
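
When comparing affected and unaffected nodes, it may help to record which transparent hugepage mode each machine is actually running. The following is only a minimal sketch, assuming the usual sysfs layout in which the active value is shown in square brackets (for example "always madvise [never]"). The thp_active helper and the THP_DIR override are names made up for this example and are not part of any existing tool.

```shell
#!/bin/sh
# Print the active value (the one in square brackets) from a THP sysfs file.
# Assumed file content looks like: "always madvise [never]"
thp_active() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Standard sysfs location; THP_DIR is a hypothetical override used here
# only so the script can be pointed at a different directory.
THP_DIR="${THP_DIR:-/sys/kernel/mm/transparent_hugepage}"

# Report the two knobs mentioned in this thread, if they are readable.
for knob in enabled defrag; do
    if [ -r "$THP_DIR/$knob" ]; then
        printf '%s: %s\n' "$knob" "$(thp_active "$THP_DIR/$knob")"
    fi
done
```

Run on each node (for example over ssh), this prints one line per knob, which makes it easy to spot machines whose settings differ from the rest of the cluster.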