tgt V0.9.7 & V0.9.8 - getting tgtd segfault error 4

Under V0.9.7 we received:
Sep 24 05:12:57 storageserver kernel: tgtd[31665]: segfault at 0000555e4ee57d90 rip 0000003dc14715a8 rsp 00007fff7f899ce0 error 4

We then upgraded to V0.9.8 and got:
Sep 25 01:37:02 storageserver kernel: tgtd[31609]: segfault at fffffffffffffff0 rip 0000000000405ae4 rsp 00007fffc7fe3940 error 4
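
For reference, the "error 4" field in these lines is the x86 page-fault error code, a bitmask: bit 0 set means a protection violation (clear means the page was not present), bit 1 set means a write (clear means a read), and bit 2 set means the fault came from user mode. A minimal sketch of decoding such a line follows; the function name decode_segfault and the regular expression are illustrative only, and treating the error field as hexadecimal is an assumption about the kernel's print format (for the value 4 it makes no difference).

```python
import re

# x86 page-fault error-code bits (defined by the hardware).
PF_PROT = 1 << 0   # set: protection violation, clear: page not present
PF_WRITE = 1 << 1  # set: write access, clear: read access
PF_USER = 1 << 2   # set: fault raised in user mode, clear: kernel mode

# Matches the 2.6.18-era "segfault at ... rip ... rsp ... error N" format
# seen in the log lines above (illustrative pattern, not a kernel API).
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault log line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    # Assumption: the error field is printed in hex.
    err = int(m.group("err"), 16)
    addr = int(m.group("addr"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": addr,
        # Reinterpret the 64-bit fault address as a signed value.
        "addr_signed": addr - (1 << 64) if addr >> 63 else addr,
        "rip": int(m.group("rip"), 16),
        "access": "write" if err & PF_WRITE else "read",
        "cause": "protection violation" if err & PF_PROT else "page not present",
        "mode": "user" if err & PF_USER else "kernel",
    }


if __name__ == "__main__":
    line = ("Sep 25 01:37:02 storageserver kernel: tgtd[31609]: segfault at "
            "fffffffffffffff0 rip 0000000000405ae4 rsp 00007fffc7fe3940 error 4")
    print(decode_segfault(line))
```

Read this way, both crashes are user-mode reads of an unmapped page. In the V0.9.8 line the fault address fffffffffffffff0 is -16 when taken as a signed 64-bit value.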

The crash is not reproducible on demand, but it generally happens within 24 hours. This configuration serves 143 disks, and multipathd is in use. This is a new installation.

targets.conf looks like:
<target iqn.2009-06.crrel:storageserver.disks>
backing-store /dev/mapper/mpath1
backing-store /dev/mapper/mpath2
...
backing-store /dev/mapper/mpath143
allow-in-use yes
</target>
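
A stanza of this size is easier to generate than to maintain by hand. The following is a minimal sketch, assuming the multipath devices are named consecutively from mpath1 to mpathN; the listing above elides the middle entries, so that naming is an assumption. render_target is a hypothetical helper, not part of tgt.

```python
def render_target(iqn, count, allow_in_use=True):
    """Build a targets.conf stanza with one backing-store line per device.

    Assumes devices are /dev/mapper/mpath1 .. /dev/mapper/mpath<count>.
    """
    lines = ["<target {}>".format(iqn)]
    for i in range(1, count + 1):
        lines.append("backing-store /dev/mapper/mpath{}".format(i))
    if allow_in_use:
        lines.append("allow-in-use yes")
    lines.append("</target>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints a stanza with the same shape as the one shown above.
    print(render_target("iqn.2009-06.crrel:storageserver.disks", 143), end="")
```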

Under V0.9.8 we are also seeing something new: "BUG: soft lockup" messages (below) just prior to the segfault.

Any suggestions? 

Thanks!

Marty

Sep 25 01:32:39 storageserver tgtd: conn_close(130) Forcing release of tx task 0x2aae7d692820 b000003b 1
Sep 25 01:32:57 storageserver kernel: BUG: soft lockup - CPU#1 stuck for 10s! [tgtd:32036]
Sep 25 01:32:57 storageserver kernel: CPU 1:
Sep 25 01:32:57 storageserver kernel: Modules linked in: hangcheck_timer autofs4 ipmi_devintf ipmi_si ipmi_msghandler hidp l2cap bluetooth sunrpc bonding cpufreq_ondemand powernow_k8 freq_table mptctl(U) dm_mirror dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport joydev sg ixgbe forcedeth i2c_nforce2 i2c_core pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache usb_storage mptfc(U) scsi_transport_fc mptspi(U) scsi_transport_spi shpchp mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Sep 25 01:32:57 storageserver kernel: Pid: 32036, comm: tgtd Tainted: G      2.6.18-128.el5 #1
Sep 25 01:32:57 storageserver kernel: RIP: 0010:[<ffffffff80064cb4>]  [<ffffffff80064cb4>] .text.lock.spinlock+0x2/0x30
Sep 25 01:32:57 storageserver kernel: RSP: 0018:ffff81080b2fd880  EFLAGS: 00000286
Sep 25 01:32:57 storageserver kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000c0000100
Sep 25 01:32:57 storageserver kernel: RDX: ffff81081ec53d69 RSI: 0000000000000001 RDI: ffff81081ec53d68
Sep 25 01:32:57 storageserver kernel: RBP: ffff810824fa5870 R08: ffff81080b2fc000 R09: 0000000000000286
Sep 25 01:32:57 storageserver kernel: R10: ffff81041ecb2040 R11: ffff810827e3d080 R12: ffff81081c06a348
Sep 25 01:32:57 storageserver kernel: R13: ffff810822d69a68 R14: ffffffff80063097 R15: ffff81080b2fd8a8
Sep 25 01:32:57 storageserver kernel: FS:  00002aab483a6940(0000) GS:ffff81010e959440(0000) knlGS:00000000e347eb90
Sep 25 01:32:57 storageserver kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 25 01:32:57 storageserver kernel: CR2: 0000003dc1499a50 CR3: 000000040ec65000 CR4: 00000000000006e0
Sep 25 01:32:57 storageserver kernel:
Sep 25 01:32:57 storageserver kernel: Call Trace:
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80021a5f>] page_lock_anon_vma+0x1d/0x26
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8003ba76>] page_referenced+0x43/0xe4
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800c72b6>] shrink_inactive_list+0x191/0x7f9
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800cbded>] page_referenced_one+0x61/0xd0
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80064cb7>] .text.lock.spinlock+0x5/0x30
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8003baa1>] page_referenced+0x6e/0xe4
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80047ab0>] __pagevec_release+0x19/0x22
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800c7004>] shrink_active_list+0x416/0x426
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80012d02>] shrink_zone+0xf6/0x11c
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800c801b>] try_to_free_pages+0x197/0x2b9
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8000f271>] __alloc_pages+0x1cb/0x2ce
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8000fb8a>] generic_file_buffered_write+0x1b0/0x6d3
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80016196>] __generic_file_aio_write_nolock+0x36c/0x3b8
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8003dd13>] do_futex+0x282/0xc3f
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800c2ce5>] generic_file_aio_write_nolock+0x20/0x6c
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800c30b1>] generic_file_write_nolock+0x8f/0xa8
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8009db21>] autoremove_wake_function+0x0/0x2e
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80063097>] thread_return+0x62/0xfe
Sep 25 01:32:57 storageserver kernel:  [<ffffffff800df545>] blkdev_file_write+0x1a/0x1f
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8001659e>] vfs_write+0xce/0x174
Sep 25 01:32:57 storageserver kernel:  [<ffffffff80043876>] sys_pwrite64+0x50/0x70
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 25 01:32:57 storageserver kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 25 01:32:57 storageserver kernel:

Sep 25 01:36:29 storageserver kernel: BUG: soft lockup - CPU#2 stuck for 10s! [kswapd0:651]
Sep 25 01:36:29 storageserver kernel: CPU 2:
Sep 25 01:36:29 storageserver kernel: Modules linked in: hangcheck_timer autofs4 ipmi_devintf ipmi_si ipmi_msghandler hidp l2cap bluetooth sunrpc bonding cpufreq_ondemand powernow_k8 freq_table mptctl(U) dm_mirror dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport joydev sg ixgbe forcedeth i2c_nforce2 i2c_core pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache usb_storage mptfc(U) scsi_transport_fc mptspi(U) scsi_transport_spi shpchp mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Sep 25 01:36:29 storageserver kernel: Pid: 651, comm: kswapd0 Tainted: G      2.6.18-128.el5 #1
Sep 25 01:36:29 storageserver kernel: RIP: 0010:[<ffffffff80064cb4>]  [<ffffffff80064cb4>] .text.lock.spinlock+0x2/0x30
Sep 25 01:36:29 storageserver kernel: RSP: 0018:ffff810827569b58  EFLAGS: 00000286
Sep 25 01:36:29 storageserver kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
Sep 25 01:36:29 storageserver kernel: RDX: ffff81081ec53d69 RSI: 0000000000000001 RDI: ffff81081ec53d68
Sep 25 01:36:29 storageserver kernel: RBP: 000000000000003e R08: ffff810827568000 R09: 0000000000000286
Sep 25 01:36:29 storageserver kernel: R10: ffff81041ecb2040 R11: 0000000000000000 R12: ffff81041ecb2040
Sep 25 01:36:29 storageserver kernel: R13: ffffffff80063097 R14: ffff810827569b80 R15: ffff81041ecb2040
Sep 25 01:36:29 storageserver kernel: FS:  00002ac5e2ad8a10(0000) GS:ffff81010e9591c0(0000) knlGS:00000000e33fdb90
Sep 25 01:36:29 storageserver kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Sep 25 01:36:29 storageserver kernel: CR2: 0000003dc1499a50 CR3: 0000000000201000 CR4: 00000000000006e0
Sep 25 01:36:29 storageserver kernel:
Sep 25 01:36:29 storageserver kernel: Call Trace:
Sep 25 01:36:29 storageserver kernel:  [<ffffffff80021a5f>] page_lock_anon_vma+0x1d/0x26
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8003ba76>] page_referenced+0x43/0xe4
Sep 25 01:36:29 storageserver kernel:  [<ffffffff800c72b6>] shrink_inactive_list+0x191/0x7f9
Sep 25 01:36:29 storageserver kernel:  [<ffffffff800cbded>] page_referenced_one+0x61/0xd0
Sep 25 01:36:29 storageserver kernel:  [<ffffffff800cbd8e>] page_referenced_one+0x2/0xd0
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8003baa1>] page_referenced+0x6e/0xe4
Sep 25 01:36:29 storageserver kernel:  [<ffffffff80047ab0>] __pagevec_release+0x19/0x22
Sep 25 01:36:29 storageserver kernel:  [<ffffffff800c7004>] shrink_active_list+0x416/0x426
Sep 25 01:36:29 storageserver kernel:  [<ffffffff80012d02>] shrink_zone+0xf6/0x11c
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8009db21>] autoremove_wake_function+0x0/0x2e
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8005778a>] kswapd+0x337/0x45a
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8009db21>] autoremove_wake_function+0x0/0x2e
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
Sep 25 01:36:29 storageserver kernel:  [<ffffffff80057453>] kswapd+0x0/0x45a
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
Sep 25 01:36:29 storageserver kernel:  [<ffffffff80032360>] kthread+0xfe/0x132
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
Sep 25 01:36:29 storageserver kernel:  [<ffffffff80032262>] kthread+0x0/0x132
Sep 25 01:36:29 storageserver kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Sep 25 01:36:29 storageserver kernel:
Sep 25 01:36:34 storageserver tgtd: conn_close(101) connection closed, 0x2aac62810018 6
Sep 25 01:36:34 storageserver tgtd: conn_close(107) sesson 0x2aaccb010130 1
Sep 25 01:36:34 storageserver tgtd: conn_close(101) connection closed, 0x2aac705106b8 1
Sep 25 01:36:34 storageserver tgtd: conn_close(107) sesson 0x2ab383692e40 1
Sep 25 01:36:34 storageserver tgtd: conn_close(101) connection closed, 0x2aac47610678 1
Sep 25 01:36:34 storageserver tgtd: conn_close(107) sesson 0x2aac476108d0 1
Sep 25 01:36:34 storageserver tgtd: conn_close(101) connection closed, 0x2aaf60a92d88 120
Sep 25 01:36:34 storageserver tgtd: conn_close(107) sesson 0x2aac98110a30 1