I have two NAS devices running an almost identical workload. One of them has been perfectly stable for over a year now. On the other, tgtd either aborts or causes the iSCSI-mounted file systems to go read-only, about once a week.

I wanted to lay out my configuration to see if there is a most-likely cause. One thing that stands out is that the scsi-target-utils version is 1.0.4 on the unstable server and 1.0.8 on the stable server. However, yum update on CentOS 6 says 1.0.4 is the most recent, and I see patches through Jan 17, 2011.

Other potential causes: bonded ethernet ports on the unstable one, and no swap partition on the unstable one (the OS is installed on a compact-flash card).

From the abrt logs:

Process /usr/sbin/tgtd was killed by signal 11 (SIGSEGV)

Which <might> be related to this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=712807

Any insight would be appreciated. Thanks in advance.

On the STABLE server:
================================
Kernel: Linux version 2.6.18-238.12.1.el5
OS: CentOS release 5.6 (Final)
Physical RAM: 4G
SWAP: 6G
scsi-target-utils-1.0.8-0.el5_6.1
iscsi-initiator-utils-6.2.0.872-6.el5
Disk array is Soft-RAID10
iSCSI volumes are sparse files on an XFS file system
Operating system is installed on an ext3 partition on the main disk array
Single ethernet port hosts main IP

On the UNSTABLE server:
===============================
Kernel: Linux version 2.6.32-71.29.1.el6.x86_64
OS: CentOS Linux release 6.0 (Final)
Physical RAM: 8G
SWAP: None
scsi-target-utils-1.0.4-3.el6_0.1.x86_64
iscsi-initiator-utils-6.2.0.872-10.el6.x86_64
Disk array is Soft-RAID6
iSCSI volumes are sparse files on an XFS file system
Operating system is installed on a compact-flash card (not part of the data disk array)
Two ethernet ports are bonded to host the main IP

Typical log file entry in /var/log/messages
=================================
Nov 30 11:14:51 san2 tgtd: conn_close(100) connection closed, 0x20786d8 1
Nov 30 11:14:51 san2 tgtd: conn_close(106) sesson 0x20db690 1
Nov 30 13:43:42 san2 kernel: tgtd[19686]: segfault at 0 ip 0000000000415edf sp 00007fff10009f30 error 4 in tgtd (deleted)[400000+2f000]
Nov 30 13:43:43 san2 tgtd: abort_task_set(1008) found 9 0
Nov 30 13:46:57 san2 abrt[26802]: file /usr/sbin/tgtd seems to be deleted
Nov 30 13:47:28 san2 abrt[26802]: saved core dump of pid 19686 (/usr/sbin/tgtd) to /var/spool/abrt/ccpp-1322678817-19686.new/coredump (649093120 bytes)
Nov 30 13:47:28 san2 abrtd: Directory 'ccpp-1322678817-19686' creation detected
Nov 30 13:47:28 san2 abrtd: Size of '/var/spool/abrt' >= 1000 MB, deleting 'ccpp-1320310886-6236'
Nov 30 13:47:32 san2 abrtd: New crash /var/spool/abrt/ccpp-1322678817-19686, processing
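
In case it helps anyone reading the kernel segfault line in the log above, here is a minimal Python sketch that pulls the fields apart. The function name decode_segfault and the regular expression are my own illustration and are not part of tgtd or abrt. The bit meanings are the usual x86 page-fault error code bits (bit 0: page was present, bit 1: write access, bit 2: fault came from user mode), and I am assuming the kernel prints the error field in hex.

```python
import re

# Kernel line copied from the /var/log/messages excerpt in this post.
LINE = ("tgtd[19686]: segfault at 0 ip 0000000000415edf "
        "sp 00007fff10009f30 error 4 in tgtd (deleted)[400000+2f000]")

# Hypothetical pattern written for this one log format.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) in "
    r".*\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its numeric fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "fault_address": addr,            # address that was accessed
        "offset_in_mapping": ip - base,   # faulting instruction, relative to mapping start
        "page_present": bool(err & 1),    # bit 0 of the x86 error code
        "write_access": bool(err & 2),    # bit 1
        "user_mode": bool(err & 4),       # bit 2
    }


if __name__ == "__main__":
    info = decode_segfault(LINE)
    print("fault address: 0x%x" % info["fault_address"])
    print("offset in tgtd mapping: 0x%x" % info["offset_in_mapping"])
    print("user mode: %s, write: %s, page present: %s" % (
        info["user_mode"], info["write_access"], info["page_present"]))
```

If I am reading it correctly, the line decodes to a user-mode read of address 0 (which looks like a NULL pointer dereference) at offset 0x15edf into the tgtd mapping. That offset should only be meaningful against the exact binary that was running, which may matter here since the log reports /usr/sbin/tgtd as deleted.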