Re: BUG: Bad page state in process ceph-osd pfn:111ce00

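This is a kernel error: the call trace goes through the kernel's page
allocator (get_page_from_freelist -> check_new_page_bad -> bad_page).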

The ceph userspace code is extremely unlikely to be responsible.

Regardless, the first step is to have this debugged by whoever supports
this kernel.

A Google search for
"page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set"
shows similar bugs, which may give some guidance.
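
When comparing against those other reports it helps to know exactly which
page flag the kernel objected to. Below is a rough sketch, in Python, of
pulling that out of a pasted log. It is not part of ceph or of any kernel
tooling: the function name parse_bad_flags is made up for illustration, and
it assumes the report has the "flags: 0x...(name|name)" shape seen in the
quoted log, with multiple flag names separated by "|".

```python
import re

# Matches the summary the kernel prints, e.g. "flags: 0x80(waiters)".
# Assumption: multiple flag names, if present, are separated by "|".
FLAGS_RE = re.compile(r"flags:\s*(0x[0-9a-fA-F]+)\(([^)]*)\)")


def parse_bad_flags(text):
    """Return (mask, [flag names]) from a 'bad because of flags' report.

    Hypothetical helper: returns None if no such report is found.
    """
    # Mail clients wrap long log lines, so strip '>' quote markers and
    # join everything back into a single string before searching.
    joined = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    start = joined.find("bad because of")
    if start < 0:
        return None
    match = FLAGS_RE.search(joined, start)
    if match is None:
        return None
    names = [name for name in match.group(2).split("|") if name]
    return int(match.group(1), 16), names


# The two wrapped lines from the quoted report below.
sample = """\
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348626] bad because of
> flags: 0x80(waiters)
"""

mask, names = parse_bad_flags(sample)
print(hex(mask), names)
```

For the report quoted below this yields the mask 0x80 and the single flag
name "waiters", which is the term worth adding to the search.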


On Tue, Jun 6, 2017 at 1:17 AM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
> Hello, we have received this today after months of running without any issues.
>
> ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90)
>
> Running Ubuntu 14.04 with kernel 4.10.2-041002-generic
>
>
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348529] BUG: Bad page
> state in process ceph-osd  pfn:111ce00
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348550]
> page:fffff1c544738000 count:0 mapcount:0 mapping:          (null)
> index:0x1
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348568] flags:
> 0x57ffffc0000080(waiters)
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348581] raw:
> 0057ffffc0000080 0000000000000000 0000000000000001 00000000ffffffff
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348597] raw:
> dead000000000100 dead000000000200 0000000000000000 0000000000000000
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348613] page dumped
> because: PAGE_FLAGS_CHECK_AT_PREP flag set
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348626] bad because of
> flags: 0x80(waiters)
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348636] Modules linked
> in: mptctl mptbase ipmi_devintf intel_rapl sb_edac edac_core
> x86_pkg_temp_thermal intel_powerclamp ipmi_ssif xfs libcrc32c coretemp
> kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc
> aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate
> intel_rapl_perf joydev input_leds mei_me lpc_ich mei ioatdma shpchp
> ipmi_si ipmi_msghandler 8021q garp mrp wmi stp llc mac_hid bonding lp
> parport mlx4_en ses enclosure hid_generic usbhid hid mlx4_core igb
> mpt3sas isci i2c_algo_bit ahci devlink dca libsas ptp raid_class
> libahci megaraid_sas pps_core scsi_transport_sas fjes
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348685] CPU: 30 PID:
> 9348 Comm: ceph-osd Tainted: G        W       4.10.2-041002-generic
> #201703120131
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348687] Hardware name:
> Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2
> 03/04/2015
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348688] Call Trace:
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348696]  dump_stack+0x63/0x81
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348701]  bad_page+0xc1/0x120
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348703]
> check_new_page_bad+0x67/0x80
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348705]
> get_page_from_freelist+0x7ea/0xb20
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348707]
> __alloc_pages_slowpath+0x1fa/0xba0
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348709]  ?
> get_page_from_freelist+0x46a/0xb20
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348712]  ?
> timerqueue_del+0x24/0x70
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348716]  ?
> __remove_hrtimer+0x3c/0x70
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348718]
> __alloc_pages_nodemask+0x209/0x260
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348723]
> migrate_misplaced_transhuge_page+0x9a/0x870
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348725]
> do_huge_pmd_numa_page+0x21f/0x4e0
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348729]
> handle_mm_fault+0x612/0x1350
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348733]
> __do_page_fault+0x23e/0x4e0
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348736]  do_page_fault+0x22/0x30
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348741]  page_fault+0x28/0x30
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348743] RIP: 0033:0x7f9ad433360d
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348744] RSP:
> 002b:00007f9aaf72bba0 EFLAGS: 00010202
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348746] RAX:
> 00007f9ad4563020 RBX: 0000000003916ed8 RCX: 00000000000000aa
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348747] RDX:
> 0000000003916ed8 RSI: 0000000003ba1da8 RDI: 00007f9ad4562fe0
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348748] RBP:
> 00007f9ad4562fe0 R08: ffffffffffffffff R09: 00007f9aaf72bce8
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348750] R10:
> 00007f9ad292afe0 R11: 0000000000000000 R12: 0000000018d48880
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348751] R13:
> 0000000027197980 R14: 00007f9aaf72bc18 R15: 0000000000000099
> Jun  5 11:08:36 roc02r-sca040 kernel: [7126162.348753] Disabling lock
> debugging due to kernel taint
> Jun  5 11:17:46 roc02r-sca040 kernel: [7126712.363404] mpt2sas_cm0:
> log_info(0x30030101): originator(IOP), code(0x03), sub_code(0x0101)
> root@roc02r-sca040:/var/log#
>
> --
> Alex Gorbachev
> Storcium
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad


