Re: Kernel Bug in 3.13.0-52

Hi Daniel,

There are some kernel recommendations here, although it's unclear whether they apply only to RBD or also to CephFS.
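
To compare against those recommendations, a quick way to see what each client is actually running (the in-kernel CephFS client module is versioned with the kernel itself) is something like:

  $ uname -r       # running kernel on the client
  $ modinfo ceph   # details of the in-kernel ceph module that ships with it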

--Lincoln

On May 13, 2015, at 3:03 PM, Daniel Takatori Ohara wrote:

Thanks, Gregory, for the answer.

I will upgrade the kernel.

Do you know which kernel version is stable for CephFS?

Thanks.


Best regards,

---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center 
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060


On Wed, May 13, 2015 at 5:01 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
On Wed, May 13, 2015 at 12:08 PM, Daniel Takatori Ohara
<dtohara@xxxxxxxxxxxxx> wrote:
> Hi,
>
> We have a small Ceph cluster with 4 OSDs and 1 MDS.
>
> The clients run Ubuntu 14.04 with kernel 3.13.0-52-generic, and the servers
> run CentOS 6.6 with kernel 2.6.32-504.16.2.el6.x86_64.
>
> The Ceph version is 0.94.1.
>
> Sometimes CephFS freezes, and dmesg shows the following messages:
>
> May 13 15:53:10 blade02 kernel: [93297.784094] ------------[ cut here ]------------
> May 13 15:53:10 blade02 kernel: [93297.784121] WARNING: CPU: 10 PID: 299 at /build/buildd/linux-3.13.0/fs/ceph/inode.c:701 fill_inode.isra.8+0x9ed/0xa00 [ceph]()
> May 13 15:53:10 blade02 kernel: [93297.784129] Modules linked in: 8021q garp stp mrp llc nfsv3 rpcsec_gss_krb5 nfsv4 ceph libceph libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp ipmi_devintf gpi
> May 13 15:53:10 blade02 kernel: [93297.784204] CPU: 10 PID: 299 Comm: kworker/10:1 Tainted: G        W     3.13.0-52-generic #86-Ubuntu
> May 13 15:53:10 blade02 kernel: [93297.784207] Hardware name: Dell Inc. PowerEdge M520/050YHY, BIOS 2.1.3 01/20/2014
> May 13 15:53:10 blade02 kernel: [93297.784221] Workqueue: ceph-msgr con_work [libceph]
> May 13 15:53:10 blade02 kernel: [93297.784225]  0000000000000009 ffff880801093a28 ffffffff8172266e 0000000000000000
> May 13 15:53:10 blade02 kernel: [93297.784233]  ffff880801093a60 ffffffff810677fd 00000000ffffffea 0000000000000036
> May 13 15:53:10 blade02 kernel: [93297.784239]  0000000000000000 0000000000000000 ffffc9001b73f9d8 ffff880801093a70
> May 13 15:53:10 blade02 kernel: [93297.784246] Call Trace:
> May 13 15:53:10 blade02 kernel: [93297.784257]  [<ffffffff8172266e>] dump_stack+0x45/0x56
> May 13 15:53:10 blade02 kernel: [93297.784264]  [<ffffffff810677fd>] warn_slowpath_common+0x7d/0xa0
> May 13 15:53:10 blade02 kernel: [93297.784269]  [<ffffffff810678da>] warn_slowpath_null+0x1a/0x20
> May 13 15:53:10 blade02 kernel: [93297.784280]  [<ffffffffa046facd>] fill_inode.isra.8+0x9ed/0xa00 [ceph]
> May 13 15:53:10 blade02 kernel: [93297.784290]  [<ffffffffa046e3cd>] ? ceph_alloc_inode+0x1d/0x4e0 [ceph]
> May 13 15:53:10 blade02 kernel: [93297.784302]  [<ffffffffa04704cf>] ceph_readdir_prepopulate+0x27f/0x6d0 [ceph]
> May 13 15:53:10 blade02 kernel: [93297.784318]  [<ffffffffa048a704>] handle_reply+0x854/0xc70 [ceph]
> May 13 15:53:10 blade02 kernel: [93297.784331]  [<ffffffffa048c3f7>] dispatch+0xe7/0xa90 [ceph]
> May 13 15:53:10 blade02 kernel: [93297.784342]  [<ffffffffa02a4a78>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
> May 13 15:53:10 blade02 kernel: [93297.784354]  [<ffffffffa02a7a9b>] try_read+0x4ab/0x10d0 [libceph]
> May 13 15:53:10 blade02 kernel: [93297.784365]  [<ffffffffa02a9418>] ? try_write+0x9a8/0xdb0 [libceph]
> May 13 15:53:10 blade02 kernel: [93297.784373]  [<ffffffff8101bc23>] ? native_sched_clock+0x13/0x80
> May 13 15:53:10 blade02 kernel: [93297.784379]  [<ffffffff8109d585>] ? sched_clock_cpu+0xb5/0x100
> May 13 15:53:10 blade02 kernel: [93297.784390]  [<ffffffffa02a98d9>] con_work+0xb9/0x640 [libceph]
> May 13 15:53:10 blade02 kernel: [93297.784398]  [<ffffffff81083aa2>] process_one_work+0x182/0x450
> May 13 15:53:10 blade02 kernel: [93297.784403]  [<ffffffff81084891>] worker_thread+0x121/0x410
> May 13 15:53:10 blade02 kernel: [93297.784409]  [<ffffffff81084770>] ? rescuer_thread+0x430/0x430
> May 13 15:53:10 blade02 kernel: [93297.784414]  [<ffffffff8108b5d2>] kthread+0xd2/0xf0
> May 13 15:53:10 blade02 kernel: [93297.784420]  [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> May 13 15:53:10 blade02 kernel: [93297.784426]  [<ffffffff817330cc>] ret_from_fork+0x7c/0xb0
> May 13 15:53:10 blade02 kernel: [93297.784431]  [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
> May 13 15:53:10 blade02 kernel: [93297.784434] ---[ end trace 05d3f5ee1f31bc67 ]---
> May 13 15:53:10 blade02 kernel: [93297.784437] ceph: fill_inode badness on ffff8807f7eaa5c0

I don't follow the kernel stuff too closely, but the CephFS kernel
client is still improving quite rapidly and 3.13 is old at this point.
You could try upgrading to something newer.
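
If it helps, a rough sketch of that upgrade on an Ubuntu 14.04 client using
the hardware-enablement (HWE) kernel stacks; lts-vivid below is just an
example, install the newest lts-* stack your mirrors carry:

  $ uname -r       # what the client runs now (3.13.0-52-generic here)
  $ sudo apt-get update
  $ sudo apt-get install --install-recommends linux-generic-lts-vivid
  $ sudo reboot    # boots into the 3.19-series HWE kernel
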
Zheng might also know what's going on and whether it's been fixed.
-Greg

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
