xfs crash & mount hang

We hit a mount hang in the Linux kernel and are hoping someone can offer some suggestions.

We created a flashcache dm device in writearound mode and used it as the data disk (not the journal)
for a ceph-osd; ceph-disk ran mkfs.xfs on it. Today the host suffered an unexpected power outage. After
powering it back on, ceph-osd got an I/O error when trying to open a leveldb .sst file on the disk, and then the kernel crashed:

2016-10-12 09:01:25.451747 7ff714151780 -1 filestore(/var/lib/ceph/osd/ceph-3) Error initializing leveldb : IO error: /var/lib/ceph/osd/ceph-3/current/omap/000059.sst: Structure needs cleaning

Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674730] XFS (dm-2): Internal error XFS_WANT_CORRUPTED_GOTO at line 1600 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xea/0x950 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674733] CPU: 10 PID: 7909 Comm: ceph-osd Tainted: G           OE   4.1.0-generic #1
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674734] Hardware name: Hangzhou H3C Technologies Co., Ltd. UIS R390X G2/UIS R390X G2, BIOS 1.00.13 08/26/2016
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674736]  0000000000000000 ffff88202dbef848 ffffffff817ee563 ffff88202da54658
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674737]  ffff882033ef2000 ffff88202dbef868 ffffffffc034c49f ffffffffc0308d5a
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674739]  0000000000000000 ffff88202dbef918 ffffffffc0305cd9 ffff882034585c80
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674739] Call Trace:
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674745]  [<ffffffff817ee563>] dump_stack+0x63/0x81
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674754]  [<ffffffffc034c49f>] xfs_error_report+0x3f/0x50 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674758]  [<ffffffffc0308d5a>] ? xfs_free_extent+0xea/0x950 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674761]  [<ffffffffc0305cd9>] xfs_alloc_log_agf+0x399/0x8d0 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674765]  [<ffffffffc0308d5a>] xfs_free_extent+0xea/0x950 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674772]  [<ffffffffc03456a5>] xfs_bmap_finish+0x155/0x1a0 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674780]  [<ffffffffc035cdf9>] xfs_itruncate_extents+0xf9/0x400 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674787]  [<ffffffffc03598dd>] xfs_setattr_size+0x36d/0x4a0 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674792]  [<ffffffffc0359a0c>] xfs_setattr_size+0x49c/0x4a0 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674813]  [<ffffffff8121f0c7>] notify_change+0x1d7/0x3e0
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674816]  [<ffffffff81200870>] do_truncate+0x60/0xa0
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674818]  [<ffffffff81210edd>] do_last+0xbfd/0xfd0
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674819]  [<ffffffff8121133f>] path_openat+0x8f/0x620
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674826]  [<ffffffffc034f16b>] ? xfs_zero_eof+0x39b/0xc10 [xfs]
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674828]  [<ffffffff812139b3>] do_filp_open+0x43/0xa0
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674830]  [<ffffffff8122044d>] ? __alloc_fd+0xcd/0x120
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674832]  [<ffffffff81201a12>] do_sys_open+0x132/0x230
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674833]  [<ffffffff81201b2e>] SyS_open+0x1e/0x20
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674837]  [<ffffffff817f65f2>] system_call_fastpath+0x16/0x75
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674840] XFS (dm-2): xfs_do_force_shutdown(0x8) called from line 135 of file fs/xfs/xfs_bmap_util.c.  Return address = 0xffffffffc03456de
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674999] XFS (dm-2): Corruption of in-memory data detected.  Shutting down filesystem
Oct 12 09:01:25 NHGO-CVK01 kernel: [  161.674999] XFS (dm-2): Please umount the filesystem and rectify the problem(s)
Oct 12 09:01:25 NHGO-CVK01 kdump-tools: /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.1.0-generic root=UUID=b7325a68-7af2-45d9-904b-e1d47ad9236b ro ramdisk_size=2048000 nomodeset elevator=deadline transparent_hugepage=always quiet irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-4.1.0-generic /boot/vmlinuz-4.1.0-generic
Oct 12 09:01:25 NHGO-CVK01 kdump-tools: loaded kdump kernel

On reboot, udev called ceph-disk to mount the disk for ceph-osd, but the mount failed again:

Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.274696] XFS (sdd1): Mounting V4 Filesystem
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.423624] XFS (sdd1): Starting recovery (logdev: internal)
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550061] XFS (sdd1): Internal error XFS_WANT_CORRUPTED_GOTO at line 1600 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xea/0x950 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550064] CPU: 9 PID: 2733 Comm: mount Tainted: G            E   4.1.0-generic #1
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550065] Hardware name: Hangzhou H3C Technologies Co., Ltd. UIS R390X G2/UIS R390X G2, BIOS 1.00.13 08/26/2016
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550068]  0000000000000000 ffff88202c86b9d8 ffffffff817ee563 ffff88202d8dc000
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550069]  ffff882034a900d0 ffff88202c86b9f8 ffffffffc033d49f ffffffffc02f9d5a
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550070]  0000000000000000 ffff88202c86baa8 ffffffffc02f6cd9 ffff882033691680
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550071] Call Trace:
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550080]  [<ffffffff817ee563>] dump_stack+0x63/0x81
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550106]  [<ffffffffc033d49f>] xfs_error_report+0x3f/0x50 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550110]  [<ffffffffc02f9d5a>] ? xfs_free_extent+0xea/0x950 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550113]  [<ffffffffc02f6cd9>] xfs_alloc_log_agf+0x399/0x8d0 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550116]  [<ffffffffc02f9d5a>] xfs_free_extent+0xea/0x950 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550124]  [<ffffffffc0361d70>] xfs_inode_item_format_convert+0x240/0x60f0 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550130]  [<ffffffffc0363416>] xfs_inode_item_format_convert+0x18e6/0x60f0 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550135]  [<ffffffffc0367d3a>] xlog_recover_finish+0x2a/0x150 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550142]  [<ffffffffc035b6d4>] xfs_log_mount_finish+0x34/0x50 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550149]  [<ffffffffc035263c>] xfs_mountfs+0x4bc/0x750 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550156]  [<ffffffffc03404b0>] ? xfs_zero_eof+0x6e0/0xc10 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550163]  [<ffffffffc03555d8>] xfs_mru_cache_done+0x1e38/0x21f0 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550167]  [<ffffffff81205ae9>] mount_bdev+0x1b9/0x200
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550172]  [<ffffffffc0355250>] ? xfs_mru_cache_done+0x1ab0/0x21f0 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550177]  [<ffffffffc0353805>] xfs_mru_cache_done+0x65/0x21f0 [xfs]
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550179]  [<ffffffff81206693>] mount_fs+0x43/0x180
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550183]  [<ffffffff81222666>] vfs_kern_mount+0x76/0x140
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550185]  [<ffffffff812248ed>] do_mount+0x20d/0xcb0
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550186]  [<ffffffff8122458a>] ? copy_mount_options+0x3a/0x170
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550187]  [<ffffffff812256fb>] SyS_mount+0x7b/0xc0
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550192]  [<ffffffff817f65f2>] system_call_fastpath+0x16/0x75
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550206] XFS (sdd1): Failed to recover EFIs
Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550207] XFS (sdd1): log mount finish failed

After kdump started, the mount hung indefinitely:

Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.293683] INFO: task mount:2733 blocked for more than 120 seconds.
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.329959]       Tainted: G           OE   4.1.0-generic #1
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.366758] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443134] mount           D ffff88202c86bb68     0  2733      1 0x00000000
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443137]  ffff88202c86bb68 ffff88103675c800 ffff882035bb5280 ffff881038d0da90
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443139]  ffff882035686eb0 0000000000000246 ffff88202c86c000 ffff88202feb5040
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443141]  ffff88202feb5068 ffff88202feb5010 ffff88202fedc000 ffff88202c86bb88
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443142] Call Trace:
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443152]  [<ffffffff817f26c7>] schedule+0x37/0x90
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443168]  [<ffffffffc0368978>] xfs_ail_push_all_sync+0x98/0xd0 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443175]  [<ffffffff810c10a0>] ? prepare_to_wait_event+0x100/0x100
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443182]  [<ffffffffc035d927>] xfs_log_quiesce+0x37/0x70 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443188]  [<ffffffffc035d981>] xfs_log_unmount+0x21/0x70 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443195]  [<ffffffffc0352731>] xfs_mountfs+0x5b1/0x750 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443202]  [<ffffffffc03404b0>] ? xfs_zero_eof+0x6e0/0xc10 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443209]  [<ffffffffc03555d8>] xfs_mru_cache_done+0x1e38/0x21f0 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443213]  [<ffffffff81205ae9>] mount_bdev+0x1b9/0x200
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443219]  [<ffffffffc0355250>] ? xfs_mru_cache_done+0x1ab0/0x21f0 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443224]  [<ffffffffc0353805>] xfs_mru_cache_done+0x65/0x21f0 [xfs]
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443226]  [<ffffffff81206693>] mount_fs+0x43/0x180
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443231]  [<ffffffff81222666>] vfs_kern_mount+0x76/0x140
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443234]  [<ffffffff812248ed>] do_mount+0x20d/0xcb0
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443235]  [<ffffffff8122458a>] ? copy_mount_options+0x3a/0x170
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443237]  [<ffffffff812256fb>] SyS_mount+0x7b/0xc0
Oct 12 10:24:09 NHGO-CVK01 kernel: [  361.443241]  [<ffffffff817f65f2>] system_call_fastpath+0x16/0x75
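
In case it helps anyone reading the traces above, here is a minimal sketch of a filter that keeps only the XFS status messages from a syslog excerpt and drops the stack dumps. The function name xfs_messages and the regular expression are illustrative assumptions based on the line format shown in this mail, not part of any existing tool.

```python
import re

# Assumed line format, taken from the excerpts in this mail, e.g.
#   "Oct 12 10:18:14 HOST kernel: [    6.550206] XFS (sdd1): Failed to recover EFIs"
XFS_LINE = re.compile(
    r"kernel: \[\s*(?P<ts>\d+\.\d+)\] XFS \((?P<dev>[^)]+)\): (?P<msg>.*)"
)


def xfs_messages(lines):
    """Yield (uptime_seconds, device, message) for each XFS kernel message."""
    for line in lines:
        match = XFS_LINE.search(line)
        if match:
            yield (
                float(match.group("ts")),
                match.group("dev"),
                match.group("msg").rstrip(),
            )


# Sample lines copied from the second log excerpt above.
sample = [
    "Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.423624] XFS (sdd1): Starting recovery (logdev: internal)",
    "Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550071] Call Trace:",
    "Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550206] XFS (sdd1): Failed to recover EFIs",
    "Oct 12 10:18:14 NHGO-CVK01 kernel: [    6.550207] XFS (sdd1): log mount finish failed",
]

for ts, dev, msg in xfs_messages(sample):
    print(f"{ts:>12.6f} {dev}: {msg}")
```

Run against the full syslog, this should reduce each incident to the few lines that state what XFS itself reported (recovery start, the internal error, the EFI recovery failure), which is usually the part worth quoting first.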

Best regards!