This is a kernel error. The ceph userspace code is extremely unlikely to be responsible. Regardless, this needs to be debugged by whoever supports this kernel as a first step. A google search for "page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set" shows similar bugs which may give some guidance.

On Tue, Jun 6, 2017 at 1:17 AM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
> Hello, we have received this today after months of running without any issues.
>
> ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90)
>
> Running Ubuntu 14.04 with kernel 4.10.2-041002-generic
>
>
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348529] BUG: Bad page
> state in process ceph-osd pfn:111ce00
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348550]
> page:fffff1c544738000 count:0 mapcount:0 mapping: (null)
> index:0x1
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348568] flags:
> 0x57ffffc0000080(waiters)
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348581] raw:
> 0057ffffc0000080 0000000000000000 0000000000000001 00000000ffffffff
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348597] raw:
> dead000000000100 dead000000000200 0000000000000000 0000000000000000
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348613] page dumped
> because: PAGE_FLAGS_CHECK_AT_PREP flag set
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348626] bad because of
> flags: 0x80(waiters)
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348636] Modules linked
> in: mptctl mptbase ipmi_devintf intel_rapl sb_edac edac_core
> x86_pkg_temp_thermal intel_powerclamp ipmi_ssif xfs libcrc32c coretemp
> kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc
> aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate
> intel_rapl_perf joydev input_leds mei_me lpc_ich mei ioatdma shpchp
> ipmi_si ipmi_msghandler 8021q garp mrp wmi stp llc mac_hid bonding lp
> parport mlx4_en ses enclosure hid_generic usbhid hid mlx4_core igb
> mpt3sas isci i2c_algo_bit ahci devlink dca libsas ptp raid_class
> libahci megaraid_sas pps_core scsi_transport_sas fjes
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348685] CPU: 30 PID:
> 9348 Comm: ceph-osd Tainted: G W 4.10.2-041002-generic
> #201703120131
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348687] Hardware name:
> Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2
> 03/04/2015
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348688] Call Trace:
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348696] dump_stack+0x63/0x81
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348701] bad_page+0xc1/0x120
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348703]
> check_new_page_bad+0x67/0x80
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348705]
> get_page_from_freelist+0x7ea/0xb20
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348707]
> __alloc_pages_slowpath+0x1fa/0xba0
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348709] ?
> get_page_from_freelist+0x46a/0xb20
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348712] ?
> timerqueue_del+0x24/0x70
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348716] ?
> __remove_hrtimer+0x3c/0x70
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348718]
> __alloc_pages_nodemask+0x209/0x260
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348723]
> migrate_misplaced_transhuge_page+0x9a/0x870
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348725]
> do_huge_pmd_numa_page+0x21f/0x4e0
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348729]
> handle_mm_fault+0x612/0x1350
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348733]
> __do_page_fault+0x23e/0x4e0
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348736] do_page_fault+0x22/0x30
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348741] page_fault+0x28/0x30
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348743] RIP: 0033:0x7f9ad433360d
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348744] RSP:
> 002b:00007f9aaf72bba0 EFLAGS: 00010202
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348746] RAX:
> 00007f9ad4563020 RBX: 0000000003916ed8 RCX: 00000000000000aa
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348747] RDX:
> 0000000003916ed8 RSI: 0000000003ba1da8 RDI: 00007f9ad4562fe0
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348748] RBP:
> 00007f9ad4562fe0 R08: ffffffffffffffff R09: 00007f9aaf72bce8
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348750] R10:
> 00007f9ad292afe0 R11: 0000000000000000 R12: 0000000018d48880
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348751] R13:
> 0000000027197980 R14: 00007f9aaf72bc18 R15: 0000000000000099
> Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348753] Disabling lock
> debugging due to kernel taint
> Jun 5 11:17:46 roc02r-sca040 kernel: [7126712.363404] mpt2sas_cm0:
> log_info(0x30030101): originator(IOP), code(0x03), sub_code(0x0101)
> root@roc02r-sca040:/var/log#
>
> --
> Alex Gorbachev
> Storcium
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Cheers,
Brad