Hi Fujita, Ryusuke,

I can't find any core files anywhere. On the other hand, syslog reports
this:

Jan 24 15:27:25 test-machine tgtd: abort_task_set(1325) found 0 0
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found fd1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found fc1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found fb1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found fa1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f91b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f81b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f71b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f61b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f51b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f41b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f31b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f21b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f11b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found f01b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found ef1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found ee1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found ed1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found ec1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found eb1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found ea1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e91b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e81b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e71b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e61b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e51b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e41b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e31b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e11b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found e01b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found df1b0000 6
Jan 24 15:27:25 test-machine tgtd: abort_cmd(1301) found d61b0000 6
Jan 24 15:27:45 test-machine tgtd: conn_close(103) connection closed, 0xed3ec0 26
Jan 24 15:27:45 test-machine tgtd: conn_close(109) sesson 0xdda890 1
Jan 24 15:27:48 test-machine tgtd: tgt_event_modify(241) Cannot find event 11
Jan 24 15:27:48 test-machine tgtd: iscsi_event_modify(557) tgt_event_modify failed
Jan 24 15:27:53 test-machine tgtd: tgt_event_modify(241) Cannot find event 11
Jan 24 15:27:53 test-machine tgtd: iscsi_event_modify(557) tgt_event_modify failed
Jan 24 15:27:58 test-machine tgtd: tgt_event_modify(241) Cannot find event 11
Jan 24 15:27:58 test-machine tgtd: iscsi_event_modify(557) tgt_event_modify failed
Jan 24 15:28:03 test-machine tgtd: tgt_event_modify(241) Cannot find event 11
Jan 24 15:28:03 test-machine tgtd: iscsi_event_modify(557) tgt_event_modify failed
Jan 24 15:28:05 test-machine tgtd: conn_close(92) already closed 0xed3ec0 25
Jan 24 15:28:08 test-machine tgtd: tgt_event_modify(241) Cannot find event 11
Jan 24 15:28:08 test-machine tgtd: iscsi_event_modify(557) tgt_event_modify failed
Jan 24 15:28:13 test-machine tgtd: iscsi_event_modify(557) tgt_event_modify failed
Jan 24 15:28:18 test-machine tgtd: iscsi_tcp_nop_work_handler(110) tcp connection timed out after 6 failed NOP-OUT
Jan 24 15:28:24 test-machine tgtd: tgtd logger exits abnormally, pid:3794
Jan 24 15:28:24 test-machine kernel: [3033879.644595] tgtd[3792]: segfault at 0 ip 00000000004076aa sp 00007fffdee18b10 error 6 in tgtd[400000+43000]
Jan 24 15:29:26 test-machine kernel: [3033940.814673] init: tgt main process (3792) killed by SEGV signal
Jan 24 15:29:26 test-machine kernel: [3033940.814709] init: tgt main process ended, respawning
Jan 24 15:29:26 test-machine tgtd: semkey 0x610f435c
Jan 24 15:29:26 test-machine tgtd: tgtd daemon started, pid:15001
Jan 24 15:29:26 test-machine tgtd: tgtd logger started, pid:15003 debug:0
Jan 24 15:29:27 test-machine tgtd: iser_ib_init(3349) Failed to initialize RDMA; load kernel modules?
Jan 24 15:29:27 test-machine tgtd: work_timer_start(150) use signal based scheduler
Jan 24 15:29:27 test-machine tgtd: bs_init(316) use signalfd notification

I should point out that after iSCSI targets are created the following
commands are executed:

sudo tgtadm --op update --mode target --tid 1 -n nop_count -v 6
sudo tgtadm --op update --mode target --tid 1 -n nop_interval -v 5

Thanks for your help.
Alban

On Fri, Jan 24, 2014 at 8:58 AM, Ryusuke Konishi
<konishi.ryusuke@xxxxxxxxxxxxx> wrote:
> Hi Alban,
> On Fri, 24 Jan 2014 14:27:37 +0900 (JST), FUJITA Tomonori wrote:
>> On Tue, 21 Jan 2014 11:29:26 +0000
>> Alban Rrustemi <alban@xxxxxxxxxxx> wrote:
>>
>>> We've been evaluating the tgt version 1.0.38 on a 64bit Linux kernel
>>> (version 3.2.0-39-generic) in an Ubuntu installation. Occasionally, we
>>> get a segmentation fault in tgtd and it's not clear what went wrong or
>>> how to get more information in order to investigate the root cause.
>>>
>>> All I get to see is lines like the ones below in the kernel log:
>>> Jan 21 07:03:40 test-machine kernel: [2744939.501604] tgtd[12887]:
>>> segfault at 0 ip 00000000004076aa sp 00007fffe0bcfa40 error 6 in
>>> tgtd[400000+43000]
>>> Jan 21 07:04:04 test-machine kernel: [2744963.554504] init: tgt main
>>> process (12887) killed by SEGV signal
>>>
>>> Is there any documentation out there or any other type of information
>>> on some tgt diagnostics I could use to investigate this?
>>
>> Unfortunately, I can't tell much with the above. Did you see anything
>> in syslog? Anything (workload, etc) changed right before the crash?
>
> Was there a core file in the root directory or at your home directory
> ?
>
> If it exists, you can get backtrace of the segmentation fault with
> gdb, and it may give very helpful information to narrow down the root
> cause of the problem.
>
> Usually, we use gdb for this purpose as follows:
>
>   # gdb tgtd /core.12345
>   ...
>   (gdb) bt
>
> You may need to install *-dbg package if your tgt has no symbol
> information.
>
> For more details, please see instructions described in the distro
> sites like below:
>
> [1] https://wiki.ubuntu.com/Backtrace
> [2] https://wiki.ubuntu.com/DebuggingProcedures
>
>
> Regards,
> Ryusuke Konishi

--
Dr Alban Rrustemi
Co-founder and Director, Fonleap Ltd
http://www.fonleap.com
--
To unsubscribe from this list: send the line "unsubscribe stgt" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
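
Even without a core file, the kernel "segfault at ..." line quoted in this thread carries some information. Below is a minimal sketch in Python that pulls it apart, assuming the x86 Linux message format seen in the log; the function name parse_segfault and the dictionary keys are hypothetical, chosen only for illustration. It decodes the low three bits of the x86 page-fault error code (page present, write access, user mode) and computes the offset of the instruction pointer inside the mapped object.

```python
import re

# Pattern for the x86 Linux kernel segfault message as it appears in the log
# in this thread. The trailing " in obj[base+size]" part is treated as optional.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[\]]+)\[(?P<pid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+)"
    r" ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    ip = int(m.group("ip"), 16)
    info = {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "ip": ip,
        # x86 page-fault error code, low bits:
        "page_present": bool(err & 0x1),  # 0 means the page was not mapped
        "write": bool(err & 0x2),         # 1 means the access was a write
        "user_mode": bool(err & 0x4),     # 1 means the fault came from user space
        "object": m.group("obj"),
        "ip_offset": None,
    }
    if m.group("base") is not None:
        # Offset of the faulting instruction inside the mapped object
        info["ip_offset"] = ip - int(m.group("base"), 16)
    return info


if __name__ == "__main__":
    sample = (
        "Jan 24 15:28:24 test-machine kernel: [3033879.644595] tgtd[3792]: "
        "segfault at 0 ip 00000000004076aa sp 00007fffdee18b10 error 6 "
        "in tgtd[400000+43000]"
    )
    for key, value in parse_segfault(sample).items():
        print(key, value)
```

For the line in this thread, error 6 has the write and user-mode bits set and the present bit clear, and the fault address is 0, which is consistent with tgtd writing through a NULL pointer. The instruction pointer is the same (00000000004076aa) in both reported crashes. With debug symbols installed, that address could be handed to a symbolizer such as addr2line or looked up in gdb; the exact path of the tgtd binary depends on the installation.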