Ilya,
I was able to gather the following syslog entries. The syslog is attached; please have a look and see whether it is helpful.

I can see the following trace:

Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.283268] Workqueue: ceph-msgr con_work [libceph]
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.291641] task: ffff880fb6868000 ti: ffff880ffaa2a000 task.ti: ffff880ffaa2a000
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.304503] RIP: 0010:[<ffffffffa035a40e>] [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.319808] RSP: 0018:ffff880ffaa2bd80 EFLAGS: 00010206
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.328659] RAX: ffff881012fb4ca8 RBX: ffff8810114a9750 RCX: ffff881012790050
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.599331] RDX: ffff881012fb4ca8 RSI: 0000000086588656 RDI: 0000000000000286
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.703539] RBP: ffff880ffaa2bdd8 R08: 0000000000000000 R09: 0000000000000000
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.810053] R10: ffffffff81600edf R11: ffffea003fef7a00 R12: ffff881012fb4c58
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.918811] R13: ffff8810114a9810 R14: ffff881012790000 R15: ffff881012790020
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029661] libceph: osd32 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029662] libceph: osd33 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029662] libceph: osd38 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029662] libceph: osd39 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd40 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd47 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd48 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd49 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029664] libceph: osd50 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029664] libceph: osd51 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029664] libceph: osd52 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029665] libceph: osd53 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.029665] libceph: osd57 down
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.631655] FS: 0000000000000000(0000) GS:ffff88101f300000(0000) knlGS:0000000000000000
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.700074] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.734306] CR2: 00007f0bbad49000 CR3: 0000000001c0e000 CR4: 00000000001407e0
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.800693] Stack:
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.832457] ffff8810114a97a8 ffff8810114a9760 ffff881012fb4800 ffff881012fb4ca8
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.897340] ffff880ffaa2bda0 ffff880ffaa2bda0 ffff881012fb4c10 ffff881012fb4830
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371758.962318] ffff881012fb49b0 ffff881012fb4860 0000000000000011 ffff880ffaa2be20
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.027390] Call Trace:
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.058230] [<ffffffffa03549e8>] con_work+0x298/0x640 [libceph]
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.089619] [<ffffffff810838a2>] process_one_work+0x182/0x450
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.120139] [<ffffffff81084641>] worker_thread+0x121/0x410
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.149533] [<ffffffff81084520>] ? rescuer_thread+0x3e0/0x3e0
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.179041] [<ffffffff8108b312>] kthread+0xd2/0xf0
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.209159] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.240921] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.273511] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.307636] Code: ff ff 48 89 df e8 e3 f1 ff ff 48 8b 7d a8 e8 7a 1c 3c e1 48 8b 7d b0 e8 41 68 d5 e0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 48 8b 45 b8 49 8b 0e 4c 89 f2 48 c7 c6 d0 e6 36 a0 48 c7
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.421674] RIP [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.462127] RSP <ffff880ffaa2bd80>
Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.567952] ---[ end trace 37d00d439ac66995 ]---
Dec 9 01:38:17 rack1-ramp-5 kernel: [1371759.614230] BUG: unable to handle kernel paging request at ffffffffffffffd8
Dec 9 01:38:17 rack1-ramp-5 kernel: [1371759.659349] IP: [<ffffffff8108b9b0>] kthread_data+0x10/0x20

Thanks & Regards
Somnath

-----Original Message-----
From: Somnath Roy
Sent: Monday, January 05, 2015 1:08 PM
To: 'Ilya Dryomov'
Cc: Chaitanya Huilgol; ceph-devel@xxxxxxxxxxxxxxx
Subject: RE: Ceph-client branch for Ubuntu 14.04.1 LTS (3.13.0-x kernels)

It's happening both when idle and under load. I don't have the trace right now but will get you one soon.

Thanks & Regards
Somnath

-----Original Message-----
From: Ilya Dryomov [mailto:ilya.dryomov@xxxxxxxxxxx]
Sent: Monday, January 05, 2015 12:34 PM
To: Somnath Roy
Cc: Chaitanya Huilgol; ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: Ceph-client branch for Ubuntu 14.04.1 LTS (3.13.0-x kernels)

On Mon, Jan 5, 2015 at 11:01 PM, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:
> Ilya,
> Here are the steps:
>
> 1. You have a cluster (3 nodes) and replication is set to 3.
>
> 2. Map a krbd image to a client.
>
> 3. Reboot or stop the Ceph services on one or more nodes.
>
> 4. The client with the krbd image mapped crashes.

Is it idle or under load? Do you have a trace of the crash?

Thanks,

Ilya
Attachment:
syslog.tar.gz
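
For anyone triaging a trace like the one in this thread, here is a minimal sketch (not part of the original report) that pulls the faulting symbol, offset, and module out of the RIP line and checks whether the marked bytes in the "Code:" line are `0f 0b`. On x86 that byte pair is the `ud2` instruction, which the kernel's BUG()/BUG_ON() macros emit, so a match suggests the crash is likely an assertion firing in `osd_reset` rather than a stray memory access. The function name `summarize_oops` and the two regular expressions are hypothetical helpers written for this illustration, and they assume the oops layout shown in this thread.

```python
import re

# Matches both RIP forms seen in the trace (assumed layout), e.g.
#   "RIP: 0010:[<ffffffffa035a40e>] [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]"
#   "RIP [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]"
RIP_RE = re.compile(
    r"RIP:?\s+(?:[0-9a-f]{4}:)?(?:\[<[0-9a-f]+>\]\s+)+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# In the "Code:" line the byte at the faulting RIP is wrapped in <..>;
# capture it together with the byte that follows it.
TRAP_RE = re.compile(r"Code:.*<([0-9a-f]{2})>\s+([0-9a-f]{2})")


def summarize_oops(lines):
    """Hypothetical helper: summarize the first RIP and Code lines found."""
    info = {"symbol": None, "offset": None, "size": None,
            "module": None, "trap_is_ud2": None}
    for line in lines:
        rip = RIP_RE.search(line)
        if rip and info["symbol"] is None:
            info["symbol"] = rip.group("symbol")
            info["offset"] = int(rip.group("offset"), 16)
            info["size"] = int(rip.group("size"), 16)
            info["module"] = rip.group("module")
        trap = TRAP_RE.search(line)
        if trap and info["trap_is_ud2"] is None:
            # 0f 0b is the x86 ud2 opcode.
            info["trap_is_ud2"] = (trap.group(1), trap.group(2)) == ("0f", "0b")
    return info


# Two lines copied from the trace in this thread.
SAMPLE_LINES = [
    "Dec 9 01:38:01 rack1-ramp-5 kernel: [1371757.304503] RIP: 0010:"
    "[<ffffffffa035a40e>] [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]",
    "Dec 9 01:38:01 rack1-ramp-5 kernel: [1371759.307636] Code: ff ff 48 89 df "
    "e8 e3 f1 ff ff 48 8b 7d a8 e8 7a 1c 3c e1 48 8b 7d b0 e8 41 68 d5 e0 48 83 "
    "c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 48 8b 45 b8 49 8b 0e 4c 89 "
    "f2 48 c7 c6 d0 e6 36 a0 48 c7",
]

print(summarize_oops(SAMPLE_LINES))
```

The symbol-plus-offset pair is what you would normally hand to a disassembler or debugger, together with the matching libceph module build, to find the source line; that step depends on having the exact kernel build and is not covered here.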