At Tue, 4 Mar 2014 17:14:41 +0200,
Or Gerlitz wrote:
>
> On 26/02/2014 07:13, Hitoshi Mitake wrote:
> > For example, we can produce segfault of tgtd under heavy load. Below
> > is a backtrace obtained from the core file:
> > (gdb) bt
> > #0  0x0000000000000000 in ?? ()
> > #1  0x0000000000411419 in event_loop () at tgtd.c:414
> > #2  0x0000000000411b65 in main (argc=<value optimized out>, argv=<value optimized out>) at tgtd.c:591
> >
> > To be honest, I still don't find an event handler which calls
> > tgt_event_del() for other fds. But with this modification, the above
> > segfault is avoided. The change seems to be effective.
>
> Just want to make sure I follow --- you do have a way to reproduce the
> bug, but from code inspection you didn't find an event handler in tgt
> which calls tgt_event_del() for "other" fds which is the trigger for
> the bug, right?

Yes. To be more precise, I'd like to describe my understanding:

1. There are some event handlers which can close "other fds",
   e.g. mtask_recv_send_handler(). It can be invoked via input on the
   unix domain socket, and it closes the fds of TCP connections when
   the user invokes "--op delete --mode target".
2. But we can produce the above segfault without using it...

>
> Can you please provide the steps to reproduce the bug?

We are using a 4-node cluster connected via 10Gbps Ethernet. One node
runs tgtd to provide the iSCSI target. The backing store is
sheepdog. All nodes run 4 VMs each and read an ISO file (about 4GB)
from a single logical unit.

# The test mocks an environment of thin clients. We need multiple
# dd processes to exhaust the 10Gbps network.

When we run the above test several times, the segfault occurs. But, of
course, we don't invoke "tgtadm --op delete --mode target" during
the testing.

Thanks,
Hitoshi
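
For readers following the thread, here is a minimal sketch of the hazard
discussed above: a handler deletes the event record of another fd while
the loop is still walking the batch of ready events, so later entries in
that batch may point at freed memory. This is not tgtd's code. All names
(event_rec, event_add, event_del, dispatch_batch, events_stale) are
hypothetical, and the "stale" flag is only one possible guard, assumed
here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_FDS 8

typedef void (*handler_fn)(int fd, void *data);

/* Hypothetical stand-in for a per-fd event record. */
struct event_rec {
    int fd;
    handler_fn handler;
    void *data;
};

static struct event_rec *registered[MAX_FDS]; /* indexed by fd */
static bool events_stale;  /* set when a record is deleted during dispatch */
static int handled_count;  /* number of handlers that actually ran */

static void event_add(int fd, handler_fn h, void *data)
{
    struct event_rec *ev = calloc(1, sizeof(*ev));

    assert(ev != NULL && fd >= 0 && fd < MAX_FDS);
    ev->fd = fd;
    ev->handler = h;
    ev->data = data;
    registered[fd] = ev;
}

static void event_del(int fd)
{
    free(registered[fd]);
    registered[fd] = NULL;
    events_stale = true;   /* pointers in the current batch may now dangle */
}

static void plain_handler(int fd, void *data)
{
    (void)fd;
    (void)data;
    handled_count++;
}

/* Handler that closes "another" fd, passed in via data. */
static void closing_handler(int fd, void *data)
{
    (void)fd;
    handled_count++;
    event_del(*(int *)data);
}

/* Dispatch one batch of ready events; returns how many handlers ran. */
static int dispatch_batch(struct event_rec **ready, int nready)
{
    events_stale = false;
    handled_count = 0;
    for (int i = 0; i < nready; i++) {
        ready[i]->handler(ready[i]->fd, ready[i]->data);
        if (events_stale)
            break;  /* abandon the batch; the caller should poll again */
    }
    return handled_count;
}

/* fd 1's handler deletes fd 2, which is also in the ready batch. */
int demo_dispatch(void)
{
    int victim = 2;
    int ran;

    event_add(1, closing_handler, &victim);
    event_add(2, plain_handler, NULL);
    event_add(3, plain_handler, NULL);

    struct event_rec *ready[] = { registered[1], registered[2], registered[3] };

    ran = dispatch_batch(ready, 3);

    event_del(1);   /* cleanup of the surviving records */
    event_del(3);
    return ran;
}
```

Without the events_stale check, the loop would go on to dereference
ready[1], which was freed by the handler for fd 1. With the check, the
rest of the batch is skipped and the surviving fds are picked up by the
next poll. Calling demo_dispatch() from a small main() is enough to try
it.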