ceph osd kernel divide error

Victor Payno <vpayno@xxxxxxxxxx> · Tue, 2 Aug 2016 11:36:46 -0700

On the node that hosts osd.14 we got this kernel message:

[Sun Jul 31 01:06:01 2016] md: md127: data-check done.
[Tue Aug  2 11:15:58 2016] divide error: 0000 [#1] SMP
[Tue Aug  2 11:15:58 2016] Modules linked in: rbd libceph dns_resolver
xfs sg 8021q garp mrp x86_pkg_temp_thermal sb_edac edac_core ioatdma
ipmi_ssif tpm_tis ext4 mbcache jbd2 raid1 ixgbe dca crc32c_intel mdio
tg3 megaraid_sas
[Tue Aug  2 11:15:58 2016] CPU: 4 PID: 9319 Comm: ceph-osd Not tainted
4.4.12-vanilla-base-1 #1
[Tue Aug  2 11:15:58 2016] Hardware name: Dell Inc. PowerEdge
R730xd/0599V5, BIOS 1.3.6 06/03/2015
[Tue Aug  2 11:15:58 2016] task: ffff880036537080 ti: ffff88039d84c000
task.ti: ffff88039d84c000
[Tue Aug  2 11:15:58 2016] RIP: 0010:[<ffffffff81166b31>]
[<ffffffff81166b31>] task_numa_find_cpu+0x1b1/0x5f0
[Tue Aug  2 11:15:58 2016] RSP: 0000:ffff88039d84fc30  EFLAGS: 00010257
[Tue Aug  2 11:15:58 2016] RAX: 0000000000000000 RBX: 000000000000000b
RCX: 0000000000000000
[Tue Aug  2 11:15:58 2016] RDX: 0000000000000000 RSI: 0000000000000001
RDI: ffff88071de3b300
[Tue Aug  2 11:15:58 2016] RBP: ffff88039d84fcc0 R08: ffff881299278000
R09: 0000000000000000
[Tue Aug  2 11:15:58 2016] R10: fffffffffffffdcd R11: 0000000000000019
R12: 0000000000000253
[Tue Aug  2 11:15:58 2016] R13: 0000000000000014 R14: fffffffffffffdf0
R15: ffff880036537080
[Tue Aug  2 11:15:58 2016] FS:  00007fe293b28700(0000)
GS:ffff88103f680000(0000) knlGS:0000000000000000
[Tue Aug  2 11:15:58 2016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Aug  2 11:15:58 2016] CR2: 0000559fdc257d10 CR3: 000000141238d000
CR4: 00000000001406e0
[Tue Aug  2 11:15:58 2016] Stack:
[Tue Aug  2 11:15:58 2016]  0000000000016ac0 fffffffffffffdf5
0000000000000253 ffff881299278000
[Tue Aug  2 11:15:58 2016]  ffff881299278000 ffffffffffffffd5
0000000000000019 ffff880036537080
[Tue Aug  2 11:15:58 2016]  ffff88039d84fcc0 00000000000000ca
00000000000000ee 0000000000000015
[Tue Aug  2 11:15:58 2016] Call Trace:
[Tue Aug  2 11:15:58 2016]  [<ffffffff8116722e>] ? task_numa_migrate+0x2be/0x8d0
[Tue Aug  2 11:15:58 2016]  [<ffffffff8116a684>] ? task_numa_fault+0xab4/0xd50
[Tue Aug  2 11:15:58 2016]  [<ffffffff81169a42>] ?
should_numa_migrate_memory+0x52/0x120
[Tue Aug  2 11:15:58 2016]  [<ffffffff81246ca4>] ? mpol_misplaced+0xd4/0x180
[Tue Aug  2 11:15:58 2016]  [<ffffffff81229b6c>] ? handle_mm_fault+0xe0c/0x1590
[Tue Aug  2 11:15:58 2016]  [<ffffffff810a1278>] ? __do_page_fault+0x178/0x410
[Tue Aug  2 11:15:58 2016]  [<ffffffff816b9818>] ? page_fault+0x28/0x30
[Tue Aug  2 11:15:58 2016] Code: 18 4c 89 ef e8 31 c2 ff ff 49 8b 85
a8 00 00 00 31 d2 49 0f af 87 00 01 00 00 49 8b 4d 70 4c 8b 6d 20 4c
8b 44 24 18 48 83 c1 01 <48> f7 f1 49 89 c7 49 29 c5 4c 03 7d 48 4d 39
f4 48 8b 4d 78 7e
[Tue Aug  2 11:15:58 2016] RIP  [<ffffffff81166b31>]
task_numa_find_cpu+0x1b1/0x5f0
[Tue Aug  2 11:15:58 2016]  RSP <ffff88039d84fc30>
[Tue Aug  2 11:15:58 2016] ---[ end trace 7aa8747e90bb7d77 ]---
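For anyone reading the trace above, here is a minimal sketch of pulling the
trapping instruction out of the "Code:" line. The helper name
split_code_line is made up for this example, and the input is the wrapped
"Code:" text from the oops joined back into one string. The kernel puts
angle brackets around the first byte of the instruction at RIP. Here the
bytes at the marker are 48 f7 f1, which is "div %rcx", and the four bytes
just before it are 48 83 c1 01, which is "add $0x1,%rcx". Together with
RCX: 0000000000000000 in the register dump, that suggests the divisor
wrapped to zero after the +1, but that is only my reading of the dump.

```python
def split_code_line(code_line):
    """Split an oops 'Code:' line at the <..> marker.

    Returns (bytes before the trapping instruction,
             bytes from the trapping instruction onward).
    """
    _, _, hex_part = code_line.partition("Code:")
    before, after = [], []
    marker_seen = False
    for token in hex_part.split():
        # The kernel wraps the byte at RIP in angle brackets, e.g. <48>.
        if token.startswith("<") and token.endswith(">"):
            marker_seen = True
            token = token[1:-1]
        (after if marker_seen else before).append(int(token, 16))
    if not marker_seen:
        raise ValueError("no <..> marker found in Code: line")
    return bytes(before), bytes(after)


# The "Code:" text from the divide error above, with the mail wrapping removed.
OOPS_CODE = (
    "Code: 18 4c 89 ef e8 31 c2 ff ff 49 8b 85 a8 00 00 00 31 d2 49 0f af "
    "87 00 01 00 00 49 8b 4d 70 4c 8b 6d 20 4c 8b 44 24 18 48 83 c1 01 "
    "<48> f7 f1 49 89 c7 49 29 c5 4c 03 7d 48 4d 39 f4 48 8b 4d 78 7e"
)

before, after = split_code_line(OOPS_CODE)
print(before[-4:].hex(" "))  # 48 83 c1 01
print(after[:3].hex(" "))    # 48 f7 f1
```

The byte strings can then be fed to any disassembler to double-check the
decoding.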

The rest of the OSDs on that node are still responsive, but we can't
get a process listing and the 15-minute load average is holding at 350+.

The kernels on a rack of rbd clients also crashed. Their networking
stacks are down, but the kernels are spamming the serial consoles with
this:

[310363.138601] kernel BUG at drivers/block/rbd.c:4638!
[310363.143843] invalid opcode: 0000 [#1] SMP
[310363.148204] Modules linked in: rbd libceph sg rpcsec_gss_krb5
xt_UDPLB(O) xt_nat xt_multiport xt_addrtype iptable_mangle iptable_raw
iptable_nat nf_nat_ipv4 nf_nat ext4 jbd2 mbcache x86_pkg_temp_thermal
gkuart(O) usbserial ie31200_edac edac_core tpm_tis raid1 crc32c_intel
[310363.175783] CPU: 6 PID: 15231 Comm: kworker/u16:1 Tainted: G
    O    4.7.0-vanilla-ams-3 #1
[310363.185246] Hardware name: Quanta T6BC-S1N/T6BC, BIOS T6BC2A01 03/26/2014
[310363.192374] Workqueue: ceph-watch-notify do_watch_notify [libceph]
[310363.198969] task: ffff880097438d40 ti: ffff88030b0e8000 task.ti:
ffff88030b0e8000
[310363.206949] RIP: 0010:[<ffffffffa01731c9>]  [<ffffffffa01731c9>]
rbd_dev_header_info+0x5a9/0x940 [rbd]
[310363.216839] RSP: 0018:ffff88030b0ebd30  EFLAGS: 00010286
[310363.222480] RAX: 0000000000000077 RBX: ffff88030d2ac800 RCX:
0000000000000000
[310363.230114] RDX: 0000000000000077 RSI: ffff88041fd8dd08 RDI:
ffff88041fd8dd08
[310363.237747] RBP: ffff88030b0ebd98 R08: 0000000000000030 R09:
0000000000000000
[310363.245391] R10: 0000000000000000 R11: 0000000000000d44 R12:
ffff88037b105000
[310363.253089] R13: ffff88030d2ac9b0 R14: 0000000000000000 R15:
ffff88006e020a00
[310363.260786] FS:  0000000000000000(0000) GS:ffff88041fd80000(0000)
knlGS:0000000000000000
[310363.269377] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[310363.275456] CR2: 00007f8f0800a048 CR3: 00000002afe5f000 CR4:
00000000001406e0
[310363.283090] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[310363.290724] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[310363.298364] Stack:
[310363.300700]  ffffffff8113a91a ffff880097438d40 ffff88041fd97ef0
ffff88041fd97ef0
[310363.308940]  ffff88041fd97ef0 000000000006625c ffff88030b0ebdd8
ffffffff8113d968
[310363.317304]  ffff88030d2ac800 ffff88037b105000 ffff88030d2ac9b0
0000000000000000
[310363.325619] Call Trace:
[310363.328503]  [<ffffffff8113a91a>] ? update_curr+0x8a/0x110
[310363.334350]  [<ffffffff8113d968>] ? dequeue_task_fair+0x618/0x1150
[310363.340872]  [<ffffffffa0173591>] rbd_dev_refresh+0x31/0xf0 [rbd]
[310363.347322]  [<ffffffffa0173719>] rbd_watch_cb+0x29/0xa0 [rbd]
[310363.353569]  [<ffffffffa013efdc>] do_watch_notify+0x4c/0x80 [libceph]
[310363.360339]  [<ffffffff811258e9>] process_one_work+0x149/0x3c0
[310363.366532]  [<ffffffff81125bae>] worker_thread+0x4e/0x490
[310363.372351]  [<ffffffff8185a9f5>] ? __schedule+0x225/0x6f0
[310363.378172]  [<ffffffff81125b60>] ? process_one_work+0x3c0/0x3c0
[310363.384523]  [<ffffffff81125b60>] ? process_one_work+0x3c0/0x3c0
[310363.390858]  [<ffffffff8112b1e9>] kthread+0xc9/0xe0
[310363.396065]  [<ffffffff8185e4ff>] ret_from_fork+0x1f/0x40
[310363.401808]  [<ffffffff8112b120>] ? kthread_create_on_node+0x170/0x170
[310363.408672] Code: 0b 44 8b 6d b8 e9 1d ff ff ff 48 c7 c1 f0 60 17
a0 ba 1e 12 00 00 48 c7 c6 90 6e 17 a0 48 c7 c7 20 58 17 a0 31 c0 e8
8a fd 07 e1 <0f> 0b 75 14 49 8b 7f 68 41 bd 92 ff ff ff e8 d4 e0 fc ff
e9 dc
[310363.433950] RIP  [<ffffffffa01731c9>] rbd_dev_header_info+0x5a9/0x940 [rbd]
[310363.441329]  RSP <ffff88030b0ebd30>
[310363.445232] ---[ end trace eca4993be8f8ac7f ]---
[310363.450313] BUG: unable to handle kernel paging request at ffffffffffffffd8
[310363.457786] IP: [<ffffffff8112b821>] kthread_data+0x11/0x20
[310363.463799] PGD 1e0a067 PUD 1e0c067 PMD 0
[310363.468573] Oops: 0000 [#2] SMP
[310363.472072] Modules linked in: rbd libceph sg rpcsec_gss_krb5
xt_UDPLB(O) xt_nat xt_multiport xt_addrtype iptable_mangle iptable_raw
iptable_nat nf_nat_ipv4 nf_nat ext4 jbd2 mbcache x86_pkg_temp_thermal
gkuart(O) usbserial ie31200_edac edac_core tpm_tis raid1 crc32c_intel
[310363.499255] CPU: 6 PID: 15231 Comm: kworker/u16:1 Tainted: G
D    O    4.7.0-vanilla-ams-3 #1
[310363.508717] Hardware name: Quanta T6BC-S1N/T6BC, BIOS T6BC2A01 03/26/2014
[310363.515845] task: ffff880097438d40 ti: ffff88030b0e8000 task.ti:
ffff88030b0e8000
[310363.523827] RIP: 0010:[<ffffffff8112b821>]  [<ffffffff8112b821>]
kthread_data+0x11/0x20
[310363.532444] RSP: 0018:ffff88030b0eba28  EFLAGS: 00010002
[310363.538110] RAX: 0000000000000000 RBX: ffff88041fd97e80 RCX:
0000000000000006
[310363.545750] RDX: ffff88040f005000 RSI: ffff880097438d40 RDI:
ffff880097438d40
[310363.553390] RBP: ffff88030b0eba30 R08: 0000000000000000 R09:
0000000000001000
[310363.561030] R10: 0000000000000000 R11: ffffea0003654801 R12:
0000000000017e80
[310363.568671] R13: 0000000000000000 R14: ffff880097439200 R15:
ffff880097438d40
[310363.576308] FS:  0000000000000000(0000) GS:ffff88041fd80000(0000)
knlGS:0000000000000000
[310363.584926] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[310363.590997] CR2: 0000000000000028 CR3: 00000002afe5f000 CR4:
00000000001406e0
[310363.598650] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[310363.606285] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[310363.613923] Stack:
[310363.616386]  ffffffff8112645e ffff88030b0eba78 ffffffff8185ab3e
ffff880097438d40
[310363.624653]  ffff88030b0eba90 ffff88030b0ec000 ffff88030b0ebad0
ffff88030b0eb6e8
[310363.632888]  ffff88040d5c8000 0000000000000000 ffff88030b0eba90
ffffffff8185aef5
[310363.645877] Call Trace:
[310363.648649]  [<ffffffff8112645e>] ? wq_worker_sleeping+0xe/0x90
[310363.654896]  [<ffffffff8185ab3e>] __schedule+0x36e/0x6f0
[310363.660538]  [<ffffffff8185aef5>] schedule+0x35/0x80
[310363.665836]  [<ffffffff81110ff9>] do_exit+0x739/0xb50
[310363.671212]  [<ffffffff8108833c>] oops_end+0x9c/0xd0
[310363.676505]  [<ffffffff810887ab>] die+0x4b/0x70
[310363.681371]  [<ffffffff81085b26>] do_trap+0xb6/0x150
[310363.686663]  [<ffffffff81085d87>] do_error_trap+0x77/0xe0
[310363.692396]  [<ffffffffa01731c9>] ? rbd_dev_header_info+0x5a9/0x940 [rbd]
[310363.699514]  [<ffffffff811d7a3d>] ? irq_work_queue+0x6d/0x80
[310363.705505]  [<ffffffff811575d4>] ? wake_up_klogd+0x34/0x40
[310363.711407]  [<ffffffff81157aa6>] ? console_unlock+0x4c6/0x510
[310363.717569]  [<ffffffff810863c0>] do_invalid_op+0x20/0x30
[310363.723298]  [<ffffffff8185fb6e>] invalid_op+0x1e/0x30
[310363.728777]  [<ffffffffa01731c9>] ? rbd_dev_header_info+0x5a9/0x940 [rbd]
[310363.735896]  [<ffffffff8113a91a>] ? update_curr+0x8a/0x110
[310363.741716]  [<ffffffff8113d968>] ? dequeue_task_fair+0x618/0x1150
[310363.748226]  [<ffffffffa0173591>] rbd_dev_refresh+0x31/0xf0 [rbd]
[310363.754649]  [<ffffffffa0173719>] rbd_watch_cb+0x29/0xa0 [rbd]
[310363.760814]  [<ffffffffa013efdc>] do_watch_notify+0x4c/0x80 [libceph]
[310363.767586]  [<ffffffff811258e9>] process_one_work+0x149/0x3c0
[310363.773746]  [<ffffffff81125bae>] worker_thread+0x4e/0x490
[310363.779562]  [<ffffffff8185a9f5>] ? __schedule+0x225/0x6f0
[310363.785374]  [<ffffffff81125b60>] ? process_one_work+0x3c0/0x3c0
[310363.791716]  [<ffffffff81125b60>] ? process_one_work+0x3c0/0x3c0
[310363.798058]  [<ffffffff8112b1e9>] kthread+0xc9/0xe0
[310363.803264]  [<ffffffff8185e4ff>] ret_from_fork+0x1f/0x40
[310363.808993]  [<ffffffff8112b120>] ? kthread_create_on_node+0x170/0x170
[310363.815849] Code: 02 00 00 00 e8 a1 fd ff ff 5d c3 0f 1f 44 00 00
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 60 04 00 00 55
48 89 e5 5d <48> 8b 40 d8 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
00 55
[310363.841079] RIP  [<ffffffff8112b821>] kthread_data+0x11/0x20
[310363.847159]  RSP <ffff88030b0eba28>
[310363.850977] CR2: ffffffffffffffd8
[310363.854624] ---[ end trace eca4993be8f8ac80 ]---
[310363.859568] Fixing recursive fault but reboot is needed!

Unfortunately, these messages weren't being logged to disk when the
crash happened.

In the osd.14 log I found this from 07/30:

2016-07-30 02:31:20.573018 7fe28d73f700  0 -- 172.20.2.63:6802/51944
>> 172.20.2.63:6818/57494 pipe(0x559fb7dd7000 sd=143 :6802 s=0 pgs=0
cs=0 l=0 c=0x559f54624580).accept connect_seq 4 vs existing 3 state
standby
2016-07-30 02:32:48.446507 7fe2912dd700  0 bad crc in data 2422823894
!= exp 1069346241
559f3d5418c02:32:48.464973 7fe2c37f0700  0 -- 10.10.2.63:6802/51944
submit_message osd_op_reply(5438
rbd_data.1d22311949ab7a.0000000000000028 [set-alloc-hint object_size
4194304 write_size 4194304,write 106496~4096] v107879'7535 uv7535
ondisk = 0) v6 remote, 10.9.5.23:0/574015403, failed lossy con,
dropping message 0
2016-07-30 02:38:57.432169 7fe29c34c700  0 -- 172.20.2.63:6802/51944
>> 172.20.3.63:6812/9319 pipe(0x559f6308c000 sd=236 :47180 s=2 pgs=10
cs=1 l=0 c=0x559f29944c60).fault with nothing to send, going to
standby

The rest of the messages on 08/02 look like this:

2016-08-02 08:53:21.305431 7fe2ae49e700  0 log_channel(cluster) log
[INF] : 1.313 scrub ok
2016-08-02 10:00:55.230664 7fe2aec9f700  0 log_channel(cluster) log
[INF] : 2.30d scrub starts
2016-08-02 10:00:55.232653 7fe2aec9f700  0 log_channel(cluster) log
[INF] : 2.30d scrub ok
2016-08-02 11:18:05.074495 7fe2ab498700 -1 osd.14 114237
heartbeat_check: no reply from osd.6 since back 2016-08-02
11:17:44.568097 front 2016-08-02 11:18:00.972352 (cutoff 2016-08-02
11:17:45.074414)
2016-08-02 11:18:05.458396 7fe2c964a700 -1 osd.14 114237
heartbeat_check: no reply from osd.6 since back 2016-08-02
11:17:44.568097 front 2016-08-02 11:18:05.073552 (cutoff 2016-08-02
11:17:45.458393)
2016-08-02 11:18:06.458605 7fe2c964a700 -1 osd.14 114237
heartbeat_check: no reply from osd.6 since back 2016-08-02
11:17:44.568097 front 2016-08-02 11:18:05.073552 (cutoff 2016-08-02
11:17:46.458602)
...
2016-08-02 11:19:35.404897 7fe2ab498700 -1 osd.14 114245
heartbeat_check: no reply from osd.6 since back 2016-08-02
11:17:44.568097 front 2016-08-02 11:19:30.702775 (cutoff 2016-08-02
11:19:15.404896)
2016-08-02 11:19:35.472022 7fe2c964a700 -1 osd.14 114245
heartbeat_check: no reply from osd.6 since back 2016-08-02
11:17:44.568097 front 2016-08-02 11:19:35.404031 (cutoff 2016-08-02
11:19:15.472020)
EOF
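To make the heartbeat_check lines easier to compare, here is a rough sketch
that parses one of them and reports how stale each timestamp is. The
function name parse_heartbeat and the regular expression are written for
this example only and are matched against the line format shown above, not
taken from Ceph. As far as I understand it, "back" and "front" are the
last replies seen on the cluster and public networks, and the cutoff is
the log time minus the heartbeat grace period (20 seconds by default, if I
remember correctly), so treat those labels as assumptions.

```python
import re
from datetime import datetime

TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+"
HEARTBEAT_RE = re.compile(
    rf"^(?P<logged>{TS}) \S+ +-?\d+ osd\.(?P<reporter>\d+) \d+ "
    rf"heartbeat_check: no reply from osd\.(?P<peer>\d+) "
    rf"since back (?P<back>{TS}) front (?P<front>{TS}) "
    rf"\(cutoff (?P<cutoff>{TS})\)$"
)


def parse_heartbeat(line):
    """Parse one heartbeat_check line; return ages in seconds or None."""
    # Collapse the mail wrapping and repeated spaces into single spaces.
    match = HEARTBEAT_RE.match(" ".join(line.split()))
    if match is None:
        return None
    stamp = {
        key: datetime.strptime(match[key], "%Y-%m-%d %H:%M:%S.%f")
        for key in ("logged", "back", "front", "cutoff")
    }
    logged = stamp["logged"]
    return {
        "reporter": int(match["reporter"]),
        "peer": int(match["peer"]),
        "grace_s": (logged - stamp["cutoff"]).total_seconds(),
        "back_age_s": (logged - stamp["back"]).total_seconds(),
        "front_age_s": (logged - stamp["front"]).total_seconds(),
    }


# First heartbeat_check line from the 08/02 excerpt above.
SAMPLE = """2016-08-02 11:18:05.074495 7fe2ab498700 -1 osd.14 114237
heartbeat_check: no reply from osd.6 since back 2016-08-02
11:17:44.568097 front 2016-08-02 11:18:00.972352 (cutoff 2016-08-02
11:17:45.074414)"""

info = parse_heartbeat(SAMPLE)
print(info)
```

For that first line the "back" reply is a little over 20 seconds old while
the "front" reply is only about 4 seconds old, and in every line quoted
above the "back" timestamp stays at 11:17:44.568097 while "front" keeps
advancing.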

-- 
Victor Payno
ビクター·ペイン

Sr. Release Engineer
シニアリリースエンジニア

Gaikai, a Sony Computer Entertainment Company   ∆○×□
ガイカイ、ソニー・コンピュータエンタテインメント傘下会社
65 Enterprise
Aliso Viejo, CA 92656 USA

Web: www.gaikai.com
Email: vpayno@xxxxxxxxxx
Phone: (949) 330-6850