EnhanceIO? I'd say get rid of that first and then try reproducing it.

Jan

> On 03 Sep 2015, at 03:14, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
>
> We have experienced a repeatable issue when performing the following:
>
> Ceph backend with no issues, we can repeat any time at will in lab and
> production. Cloning an ESXi VM to another VM on the same datastore on
> which the original VM resides. Practically instantly, the LIO machine
> becomes unresponsive, Pacemaker fails over to another LIO machine and
> that too becomes unresponsive.
>
> Both running Ubuntu 14.04, kernel 4.1 (4.1.0-040100-generic x86_64),
> Ceph Hammer 0.94.2, and have been able to take quite a workload with no
> issues.
>
> Output of /var/log/syslog below. I also have a screen dump of a
> frozen system - attached.
>
> Thank you,
> Alex
>
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886254] CPU: 22 PID: 18130 Comm: kworker/22:1 Tainted: G C OE 4.1.0-040100-generic #201506220235
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886303] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886364] Workqueue: xcopy_wq target_xcopy_do_work [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886395] task: ffff8810441c3250 ti: ffff88105bb40000 task.ti: ffff88105bb40000
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886440] RIP: 0010:[<ffffffffc03e4529>] [<ffffffffc03e4529>] sbc_check_prot+0x49/0x210 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886498] RSP: 0018:ffff88105bb43b88 EFLAGS: 00010246
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886525] RAX: 0000000000000400 RBX: ffff8810589eb008 RCX: 0000000000000400
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886554] RDX: ffff8810589eb0f8 RSI: 0000000000000000 RDI: 0000000000000000
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886584] RBP: ffff88105bb43bc8 R08: 0000000000000000 R09: 0000000000000001
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886613] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886643] R13: ffff88084860c000 R14: ffffffffc02372c0 R15: 0000000000000400
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886673] FS: 0000000000000000(0000) GS:ffff88105f480000(0000) knlGS:0000000000000000
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886719] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886747] CR2: 0000000000000010 CR3: 0000000001e0f000 CR4: 00000000001407e0
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886777] Stack:
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886798] 0000000b00000000 000000000000000c 0000000000000000 ffff8810589eb0f8
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886851] ffff8810589eb008 ffff88084860c000 ffffffffc02372c0 0000000000000400
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886904] ffff88105bb43c28 ffffffffc03e528a 0000000c00000000 000400000000000c
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886957] Call Trace:
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.886989] [<ffffffffc03e528a>] sbc_parse_cdb+0x66a/0xa20 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887022] [<ffffffffc0233195>] iblock_parse_cdb+0x15/0x20 [target_core_iblock]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887077] [<ffffffffc03de950>] target_setup_cmd_from_cdb+0x1c0/0x260 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887133] [<ffffffffc03ed1bd>] target_xcopy_setup_pt_cmd+0x8d/0x170 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887188] [<ffffffffc03edb16>] target_xcopy_read_source.isra.12+0x126/0x220 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887243] [<ffffffff81020509>] ? sched_clock+0x9/0x10
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887279] [<ffffffffc03edf01>] target_xcopy_do_work+0xf1/0x370 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887329] [<ffffffff810146a6>] ? __switch_to+0x1e6/0x580
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887361] [<ffffffff81096414>] process_one_work+0x144/0x490
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887390] [<ffffffff81096e7e>] worker_thread+0x11e/0x460
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887418] [<ffffffff81096d60>] ? create_worker+0x1f0/0x1f0
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887449] [<ffffffff8109ce59>] kthread+0xc9/0xe0
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887477] [<ffffffff8109cd90>] ? flush_kthread_worker+0x90/0x90
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887510] [<ffffffff8180d6a2>] ret_from_fork+0x42/0x70
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.887538] [<ffffffff8109cd90>] ? flush_kthread_worker+0x90/0x90
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.890342] Code: 7d f8 49 89 fd 4c 89 65 e0 44 0f b6 62 01 41 89 cf 48 8b be 80 00 00 00 41 8b b5 18 04 00 00 41 c0 ec 05 48 83 bb f0 01 00 00 00 <8b> 4f 10 41 89 f6 74 0a 8b 83 f8 01 00 00 85 c0 75 14 45 84 e4
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.890580] RIP [<ffffffffc03e4529>] sbc_check_prot+0x49/0x210 [target_core_mod]
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.890636] RSP <ffff88105bb43b88>
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.890659] CR2: 0000000000000010
> Sep 2 12:11:55 roc-4r-scd214 kernel: [86831.890956] ---[ end trace 894b2880b8116889 ]---
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.204150] BUG: unable to handle kernel paging request at ffffffffffffffd8
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.204291] IP: [<ffffffff8109d220>] kthread_data+0x10/0x20
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.204392] PGD 1e12067 PUD 1e14067 PMD 0
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.204563] Oops: 0000 [#2] SMP
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.204695] Modules linked in: enhanceio_rand(OE) enhanceio_lru(OE) enhanceio_fifo(OE) enhanceio(OE) target_core_user uio rbd libceph libcrc32c iscsi_target_mod target_core_file target_core_pscsi target_core_iblock target_core_mod configfs xt_multiport iptable_filter ip_tables x_tables ipmi_devintf ipmi_ssif bonding x86_pkg_temp_thermal intel_powerclamp coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 8021q garp mrp stp llc sb_edac joydev edac_core mei_me lpc_ich mei ioatdma ses enclosure ipmi_si 8250_fintek ipmi_msghandler wmi shpchp mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic igb usbhid ahci hid mpt2sas i2c_algo_bit libahci dca ptp raid_class mlx4_core scsi_transport_sas pps_core
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.207888] CPU: 22 PID: 18130 Comm: kworker/22:1 Tainted: G D C OE 4.1.0-040100-generic #201506220235
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.207972] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208062] task: ffff8810441c3250 ti: ffff88105bb40000 task.ti: ffff88105bb40000
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208141] RIP: 0010:[<ffffffff8109d220>] [<ffffffff8109d220>] kthread_data+0x10/0x20
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208261] RSP: 0018:ffff88105bb43838 EFLAGS: 00010096
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208322] RAX: 0000000000000000 RBX: 0000000000000016 RCX: ffffffff820ea340
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208374] ABORT_TASK: Found referenced iSCSI task_tag: 3511431
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208375] ABORT_TASK: ref_tag: 3511431 already complete, skipping
> Sep 2 12:12:04 roc-4r-scd214 kernel: [86833.208376] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 3511431
> <2015-09-02_21-07-15.png>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com