Hello, Wondering if anyone else has come over an issue we're having with our POC CEPH Cluster at the moment. Some details about its setup; 6 x Dell R720 (20 x 1TB Drives, 4 xSSD CacheCade), 4 x 10GB Nics 4 x Generic white label server (24 x 2 4TB Disk Raid-0 ), 4 x 10GB Nics 3 x Dell R620 - Acting as ISCSI Heads (targetcli / Linux kernel ISCSI) - 4 x 10GB Nics. An RBD device is mounted and exported via targetcli, this is then mounted on a client device to push backup data. All machines are running Ubuntu 12.04.3 LTS and ceph 0.67.4 Machines are split over two racks (distinct layer 2 domains) using a leaf/spine model and we use ECMP/quagga on the ISCSI heads to reach the CEPH Cluster. Crush map has racks defined to spread data over 2 racks - I've attached the ceph.conf The cluster performs great normally, and we only have issues when simulating rack failure. The issue comes when the following steps are taken o) Initiate load against the cluster (backups going via ISCSI) o) ceph osd set noout o) Reboot 2 x Generic Servers / 3 x Dell Servers (basically all the nodes in 1 Rack) o) Cluster goes degraded, as expected cluster 55dcf929-fca5-49fe-99d0-324a19afd5b4 health HEALTH_WARN 7056 pgs degraded; 282 pgs stale; 2842 pgs stuck unclean; recovery 1286582/2700870 degraded (47.636%); 108/216 in osds are down; noout flag(s) set monmap e3: 5 mons at {fh-ceph01-mon-01=172.17.12.224:6789/0,fh-ceph01-mon-02=172.17.12.225:6789/0,fh-ceph01-mon-03=172.17.11.224:6789/0,fh-ceph01-mon-04=172.17.11.225:6789/0,fh-ceph01-mon-05=172.17.12.226:6789/0}, election epoch 74, quorum 0,1,2,3,4 fh-ceph01-mon-01,fh-ceph01-mon-02,fh-ceph01-mon-03,fh-ceph01-mon-04,fh-ceph01-mon-05 osdmap e4237: 216 osds: 108 up, 216 in pgmap v117686: 7328 pgs: 266 active+clean, 6 stale+active+clean, 6780 active+degraded, 276 stale+active+degraded; 3511 GB data, 10546 GB used, 794 TB / 805 TB avail; 1286582/2700870 degraded (47.636%) mdsmap e1: 0/0/1 up 2013-11-05 08:51:44.830393 mon.0 [INF] pgmap v117685: 7328 pgs: 1489 active+clean, 1289 stale+active+clean, 3215 active+degraded, 1335 stale+active+degraded; 3511 GB data, 10546 GB used, 794 TB / 805 TB avail; 1048742/2700870 degraded (38.830%); recovering 7 o/s, 28969KB/s o) As OSDS start returning 2013-11-05 08:52:42.019295 mon.0 [INF] osd.165 172.17.11.9:6864/6074 boot 2013-11-05 08:52:42.023055 mon.0 [INF] osd.154 172.17.11.9:6828/5943 boot 2013-11-05 08:52:42.024226 mon.0 [INF] osd.159 172.17.11.9:6816/5820 boot 2013-11-05 08:52:42.031996 mon.0 [INF] osd.161 172.17.11.9:6856/6059 boot o) We then see some slow requests; 2013-11-05 08:53:11.677044 osd.153 [WRN] 6 slow requests, 6 included below; oldest blocked for > 30.409992 secs 2013-11-05 08:53:11.677052 osd.153 [WRN] slow request 30.409992 seconds old, received at 2013-11-05 08:52:41.266994: osd_op(client.16010.1:13441679 rb.0.21ec.238e1f29.00000012fa28 [write 2854912~4096] 3.516ef071 RETRY=-1 e4240) currently reached pg 2013-11-05 08:53:11.677056 osd.153 [WRN] slow request 30.423024 seconds old, received at 2013-11-05 08:52:41.253962: osd_op(client.15755.1:13437999 rb.0.21ec.238e1f29.00000012fa28 [write 0~233472] 3.516ef071 RETRY=1 e4240) v4 currently reached pg o) A few minutes , the ISCSI heads start panicking Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664305] ------------[ cut here ]------------ Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664313] WARNING: at /build/buildd/linux-lts-raring-3.8.0/kernel/watchdog.c:246 wat chdog_overflow_callback+0x9a/0xc0() Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664315] Hardware name: PowerEdge R620 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664317] Watchdog detected hard LOCKUP on cpu 6 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664318] Modules linked in: ib_srpt(F) tcm_qla2xxx(F) tcm_loop(F) tcm_fc(F) iscsi_t arget_mod(F) target_core_pscsi(F) target_core_file(F) target_core_iblock(F) target_core_mod(F) rbd(F) libceph(F) ipmi_devintf(F) ipm i_si(F) ipmi_msghandler(F) qla2xxx(F) libfc(F) scsi_transport_fc(F) scsi_tgt(F) configfs(F) dell_rbu(F) ib_iser(F) rdma_cm(F) ib_cm( F) iw_cm(F) ib_sa(F) ib_mad(F) ib_core(F) ib_addr(F) ext2(F) iscsi_tcp(F) libiscsi_tcp(F) libiscsi(F) scsi_transport_iscsi(F) corete mp(F) kvm_intel(F) kvm(F) ghash_clmulni_intel(F) aesni_intel(F) ablk_helper(F) cryptd(F) lrw(F) aes_x86_64(F) xts(F) gf128mul(F) gpi o_ich(F) dcdbas(F) microcode(F) joydev(F) shpchp(F) sb_edac(F) wmi(F) edac_core(F) acpi_power_meter(F) mei(F) lpc_ich(F) mac_hid(F) 8021q(F) garp(F) stp(F) llc(F) lp(F) parport(F) hid_generic(F) usbhid(F) hid(F) ahci(F) libahci(F) ixgbe(F) dca(F) megaraid_sas(F) m dio(F) tg3(F) ptp(F) pps_core(F) btrfs(F) zlib_deflate(F) libcrc32c(F) [last unloaded: target_core_mod] Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664387] Pid: 460, comm: kworker/u:5 Tainted: GF W 3.8.0-31-generic #46~pr ecise1-Ubuntu Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664389] Call Trace: Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664391] <NMI> [<ffffffff81059b2f>] warn_slowpath_common+0x7f/0xc0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664402] [<ffffffff81059c26>] warn_slowpath_fmt+0x46/0x50 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664406] [<ffffffff810ed6fa>] watchdog_overflow_callback+0x9a/0xc0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664410] [<ffffffff8112c75c>] __perf_event_overflow+0x9c/0x220 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664415] [<ffffffff81025aca>] ? x86_perf_event_set_period+0xda/0x150 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664418] [<ffffffff8112cf54>] perf_event_overflow+0x14/0x20 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664423] [<ffffffff8102b8db>] intel_pmu_handle_irq+0x19b/0x250 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664429] [<ffffffff816f5c71>] perf_event_nmi_handler+0x21/0x30 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664432] [<ffffffff816f53a9>] nmi_handle.isra.0+0x59/0x80 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664435] [<ffffffff816f54b8>] default_do_nmi+0x58/0x240 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664438] [<ffffffff816f5730>] do_nmi+0x90/0xd0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664442] [<ffffffff816f49c1>] end_repeat_nmi+0x1e/0x2e Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664447] [<ffffffff81045c42>] ? __ticket_spin_lock+0x22/0x30 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664450] [<ffffffff81045c42>] ? __ticket_spin_lock+0x22/0x30 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664453] [<ffffffff81045c42>] ? __ticket_spin_lock+0x22/0x30 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664454] <<EOE>> [<ffffffff81045cd9>] default_spin_lock_flags+0x9/0x10 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664460] [<ffffffff816f400e>] _raw_spin_lock_irqsave+0x2e/0x40 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664466] [<ffffffff8108a442>] __wake_up+0x32/0x70 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664477] [<ffffffffa03687d2>] iscsit_add_cmd_to_response_queue+0xb2/0xe0 [iscsi_ta rget_mod] Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664483] [<ffffffffa0374e28>] lio_queue_tm_rsp+0x28/0x30 [iscsi_target_mod] Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664492] [<ffffffffa03a79ee>] target_tmr_work+0x6e/0xf0 [target_core_mod] Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664496] [<ffffffff81078d01>] process_one_work+0x141/0x490 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664500] [<ffffffff81079cc8>] worker_thread+0x168/0x400 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664503] [<ffffffff81079b60>] ? manage_workers+0x120/0x120 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664506] [<ffffffff8107f1d0>] kthread+0xc0/0xd0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664510] [<ffffffff8107f110>] ? flush_kthread_worker+0xb0/0xb0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664514] [<ffffffff816fcb2c>] ret_from_fork+0x7c/0xb0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664517] [<ffffffff8107f110>] ? flush_kthread_worker+0xb0/0xb0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69081.664519] ---[ end trace 9b6438fe7cb20627 ]--- Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602470] INFO: rcu_sched detected stalls on CPUs/tasks: { 6} (detected by 21, t=150 02 jiffies, g=18619, c=18618, q=7934) Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602570] sending NMI to all CPUs: Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602574] NMI backtrace for cpu 0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602577] CPU 0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602581] Pid: 0, comm: swapper/0 Tainted: GF W 3.8.0-31-generic #46~precis e1-Ubuntu Dell Inc. PowerEdge R620/0PXXHP Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602583] RIP: 0010:[<ffffffff8158b2e2>] [<ffffffff8158b2e2>] poll_idle+0x42/0x90 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602590] RSP: 0018:ffffffff81c01db8 EFLAGS: 00000246 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602592] RAX: 0000000400000000 RBX: 0000000000000000 RCX: 0000000000000018 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602594] RDX: ffffffff81c01fd8 RSI: ffffffff81c7a380 RDI: ffffffff81c1af00 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602596] RBP: ffffffff81c01df8 R08: 0000000000000020 R09: 0000000000000000 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602598] R10: 0000000000002000 R11: 0000000000000000 R12: ffff880fffc1b370 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602599] R13: 00003ef672bf9148 R14: ffffffff81c7a380 R15: 0000000000000000 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602602] FS: 0000000000000000(0000) GS:ffff880fffc00000(0000) knlGS:00000000000000 00 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602604] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602605] CR2: 00007fffd8f0f000 CR3: 0000000fda277000 CR4: 00000000000407f0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602607] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602609] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602611] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c15 440) Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602613] Stack: Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602614] 0000000000000000 0000000006e86f73 ffffffff81c01dd8 ffffffff8101ba59 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602620] ffffffff81c01e08 ffff880fffc1b370 00003ef672bf9129 ffffffff8158ae40 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602625] ffffffff81c01e08 ffffffff8158ae59 ffffffff81c01e68 ffffffff8158b8d1 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602629] Call Trace: Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602634] [<ffffffff8101ba59>] ? read_tsc+0x9/0x20 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602638] [<ffffffff8158ae40>] ? centrino_target+0x360/0x360 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602641] [<ffffffff8158ae59>] cpuidle_enter+0x19/0x20 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602645] [<ffffffff8158b8d1>] cpuidle_wrap_enter+0x41/0xa0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602648] [<ffffffff8158cd1e>] ? menu_select+0x16e/0x2b0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602651] [<ffffffff8158b940>] cpuidle_enter_tk+0x10/0x20 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602654] [<ffffffff8158b4ff>] cpuidle_idle_call+0xaf/0x2b0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602658] [<ffffffff8101da8f>] cpu_idle+0xcf/0x120 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602664] [<ffffffff816c7082>] rest_init+0x72/0x80 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602669] [<ffffffff81d05c4f>] start_kernel+0x3d1/0x3de Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602673] [<ffffffff81d057ff>] ? pass_bootoption.constprop.2+0xd3/0xd3 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602676] [<ffffffff81d05397>] x86_64_start_reservations+0x131/0x135 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602680] [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602683] [<ffffffff81d05468>] x86_64_start_kernel+0xcd/0xdc Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602685] Code: e8 84 36 b2 ff 49 89 c5 fb 66 66 90 66 66 90 65 48 8b 14 25 30 c7 00 00 48 8b 82 38 e0 ff ff a8 08 75 13 66 0f 1f 44 00 00 f3 90 <48> 8b 82 38 e0 ff ff a8 08 74 f3 e8 4e 36 b2 ff 4c 29 e8 48 89 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602718] NMI backtrace for cpu 1 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602721] CPU 1 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602725] Pid: 0, comm: swapper/1 Tainted: GF W 3.8.0-31-generic #46~precis e1-Ubuntu Dell Inc. PowerEdge R620/0PXXHP Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602728] RIP: 0010:[<ffffffff8158b2e9>] [<ffffffff8158b2e9>] poll_idle+0x49/0x90 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602735] RSP: 0018:ffff880fe9545de8 EFLAGS: 00000246 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602737] RAX: 0000000400000000 RBX: 0000000000000000 RCX: 0000000000000018 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602739] RDX: ffff880fe9545fd8 RSI: ffffffff81c7a380 RDI: ffffffff81c1af00 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602741] RBP: ffff880fe9545e28 R08: 0000000000000020 R09: 0000000000000000 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602743] R10: 0000000000002000 R11: 0000000000000000 R12: ffff88203fc1b370 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602745] R13: 00003ef672bf7e9f R14: ffffffff81c7a380 R15: 0000000000000000 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602747] FS: 0000000000000000(0000) GS:ffff88203fc00000(0000) knlGS:00000000000000 00 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602749] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602751] CR2: 00007f1dc4053038 CR3: 0000001fd97aa000 CR4: 00000000000407e0 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602755] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Nov 5 08:56:06 fh-ceph01-iscsi-01 kernel: [69125.602757] Process swapper/1 (pid: 0, threadinfo ffff880fe9544000, task ffff880fe9549 740) I'm in the process of building a separate iscsi head on different hardware, just incase it's something related to the M620's but wanted to see if anyone had any ideas? Any input greatly appreciated. I can pretty much replicate this on demand at the moment by enacting the above steps. Cheers,
Attachment:
ceph.conf
Description: ceph.conf
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com