Re: Continuously crashes in kernel 4.5.2

Hi Nicholas, and thanks for your reply.
I realized that ERL=2 was causing trouble, so I switched to ERL=0
as you suggested.
After many days of trouble-free service (I have since moved to Linux 4.6.0),
I got this in dmesg again, although this time the kernel did not crash:

[Tue Jun 14 18:23:49 2016] Received CmdSN: 0x02694b50 is less than
ExpCmdSN: 0x02694b52, ignoring.
[Tue Jun 14 18:23:49 2016] iscsi_trx: page allocation failure:
order:3, mode:0x208c020(GFP_ATOMIC|__GFP_COMP|__GFP_ZERO)
[Tue Jun 14 18:23:49 2016] CPU: 1 PID: 13872 Comm: iscsi_trx Not
tainted 4.6.0-gentoo #1
[Tue Jun 14 18:23:49 2016] Hardware name: Dell Inc. PowerEdge
T20/0VD5HY, BIOS A01 08/13/2013
[Tue Jun 14 18:23:49 2016]  0000000000000000 ffffffff812b8d18
0000000000000000 ffff8801a4dfbc40
[Tue Jun 14 18:23:49 2016]  ffffffff810d3d7d 0208c0201edf8838
fffffffffffffff8 0000000000000000
[Tue Jun 14 18:23:49 2016]  ffff88021edd8800 0000000000000000
ffff88021edf8848 0000000000000008
[Tue Jun 14 18:23:49 2016] Call Trace:
[Tue Jun 14 18:23:49 2016]  [<ffffffff812b8d18>] ? dump_stack+0x46/0x59
[Tue Jun 14 18:23:49 2016]  [<ffffffff810d3d7d>] ? warn_alloc_failed+0x117/0x137
[Tue Jun 14 18:23:49 2016]  [<ffffffff810d6203>] ?
__alloc_pages_nodemask+0x895/0x939
[Tue Jun 14 18:23:49 2016]  [<ffffffff8106afdf>] ? print_time.part.12+0x4f/0x52
[Tue Jun 14 18:23:49 2016]  [<ffffffff81106b04>] ? alloc_pages_current+0xb1/0xd3
[Tue Jun 14 18:23:49 2016]  [<ffffffff810e874c>] ? kmalloc_order+0xf/0x3a
[Tue Jun 14 18:23:49 2016]  [<ffffffff810e89ed>] ? kmalloc_order_trace+0x19/0x8d
[Tue Jun 14 18:23:49 2016]  [<ffffffffa023e6b8>] ?
iscsit_dump_data_payload+0x50/0x181 [iscsi_target_mod]
[Tue Jun 14 18:23:49 2016]  [<ffffffffa024a5aa>] ?
iscsi_target_rx_thread+0xa48/0xa98 [iscsi_target_mod]
[Tue Jun 14 18:23:49 2016]  [<ffffffff81011684>] ? __switch_to+0x157/0x392
[Tue Jun 14 18:23:49 2016]  [<ffffffff8105f3b5>] ? dequeue_task_fair+0x163/0x1da
[Tue Jun 14 18:23:49 2016]  [<ffffffffa0249b62>] ?
iscsi_target_tx_thread+0x1a5/0x1a5 [iscsi_target_mod]
[Tue Jun 14 18:23:49 2016]  [<ffffffff81051a2b>] ? kthread+0x95/0x9d
[Tue Jun 14 18:23:49 2016]  [<ffffffff81538ed2>] ? ret_from_fork+0x22/0x40
[Tue Jun 14 18:23:49 2016]  [<ffffffff81051996>] ? init_completion+0x1d/0x1d
[Tue Jun 14 18:23:49 2016] Mem-Info:
[Tue Jun 14 18:23:49 2016] active_anon:196 inactive_anon:326 isolated_anon:0
                            active_file:877143 inactive_file:893182
isolated_file:0
                            unevictable:3780 dirty:9262 writeback:0 unstable:0
                            slab_reclaimable:46676 slab_unreclaimable:33780
                            mapped:1708 shmem:2 pagetables:236 bounce:0
                            free:40125 free_pcp:418 free_cma:0
[Tue Jun 14 18:23:49 2016] Node 0 DMA free:15852kB min:128kB low:160kB
high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15936kB managed:15852kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? yes
[Tue Jun 14 18:23:49 2016] lowmem_reserve[]: 0 3414 7875 7875
[Tue Jun 14 18:23:49 2016] Node 0 DMA32 free:88792kB min:28384kB
low:35480kB high:42576kB active_anon:364kB inactive_anon:240kB
active_file:1509984kB inactive_file:1530632kB unevictable:9528kB
isolated(anon):0kB isolated(file):0kB present:3578504kB
managed:3502092kB mlocked:9528kB dirty:8156kB writeback:0kB
mapped:4052kB shmem:8kB slab_reclaimable:80668kB
slab_unreclaimable:78324kB kernel_stack:672kB pagetables:568kB
unstable:0kB bounce:0kB free_pcp:620kB local_pcp:0kB free_cma:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[Tue Jun 14 18:23:49 2016] lowmem_reserve[]: 0 0 4460 4460
[Tue Jun 14 18:23:49 2016] Node 0 Normal free:55856kB min:37020kB
low:46272kB high:55524kB active_anon:420kB inactive_anon:1064kB
active_file:1998588kB inactive_file:2042096kB unevictable:5592kB
isolated(anon):0kB isolated(file):0kB present:4700160kB
managed:4568028kB mlocked:5592kB dirty:28892kB writeback:0kB
mapped:2780kB shmem:0kB slab_reclaimable:106036kB
slab_unreclaimable:56796kB kernel_stack:2016kB pagetables:376kB
unstable:0kB bounce:0kB free_pcp:1052kB local_pcp:424kB free_cma:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[Tue Jun 14 18:23:49 2016] lowmem_reserve[]: 0 0 0 0
[Tue Jun 14 18:23:49 2016] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB
1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U)
1*2048kB (M) 3*4096kB (M) = 15852kB
[Tue Jun 14 18:23:49 2016] Node 0 DMA32: 3918*4kB (UME) 8928*8kB (U)
115*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 88936kB
[Tue Jun 14 18:23:49 2016] Node 0 Normal: 6115*4kB (UE) 3932*8kB (UM)
0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 55916kB
[Tue Jun 14 18:23:49 2016] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
[Tue Jun 14 18:23:49 2016] 1771862 total pagecache pages
[Tue Jun 14 18:23:49 2016] 92 pages in swap cache
[Tue Jun 14 18:23:49 2016] Swap cache stats: add 4926, delete 4834,
find 802213/803440
[Tue Jun 14 18:23:49 2016] Free swap  = 10478708kB
[Tue Jun 14 18:23:49 2016] Total swap = 10485756kB
[Tue Jun 14 18:23:49 2016] 2073650 pages RAM
[Tue Jun 14 18:23:49 2016] 0 pages HighMem/MovableOnly
[Tue Jun 14 18:23:49 2016] 52157 pages reserved
[Tue Jun 14 18:23:49 2016] Unable to allocate 32768 bytes for offload buffer.
[Tue Jun 14 18:48:46 2016] Received CmdSN: 0x02696f4e is less than
ExpCmdSN: 0x02696f4f, ignoring.
[Tue Jun 14 18:48:46 2016] iscsi_trx: page allocation failure:
order:3, mode:0x208c020(GFP_ATOMIC|__GFP_COMP|__GFP_ZERO)
[Tue Jun 14 18:48:46 2016] CPU: 1 PID: 14614 Comm: iscsi_trx Not
tainted 4.6.0-gentoo #1
[Tue Jun 14 18:48:46 2016] Hardware name: Dell Inc. PowerEdge
T20/0VD5HY, BIOS A01 08/13/2013
[Tue Jun 14 18:48:46 2016]  0000000000000000 ffffffff812b8d18
0000000000000000 ffff880098e63c40
[Tue Jun 14 18:48:46 2016]  ffffffff810d3d7d 0208c0201edf8838
fffffffffffffff8 0000000000000000
[Tue Jun 14 18:48:46 2016]  ffff88021edd8800 0000000000000000
ffff88021edf8848 0000000000000008
[Tue Jun 14 18:48:46 2016] Call Trace:
[Tue Jun 14 18:48:46 2016]  [<ffffffff812b8d18>] ? dump_stack+0x46/0x59
[Tue Jun 14 18:48:46 2016]  [<ffffffff810d3d7d>] ? warn_alloc_failed+0x117/0x137
[Tue Jun 14 18:48:46 2016]  [<ffffffff810d6203>] ?
__alloc_pages_nodemask+0x895/0x939
[Tue Jun 14 18:48:46 2016]  [<ffffffff8106afdf>] ? print_time.part.12+0x4f/0x52
[Tue Jun 14 18:48:46 2016]  [<ffffffff81106b04>] ? alloc_pages_current+0xb1/0xd3
[Tue Jun 14 18:48:46 2016]  [<ffffffff810e874c>] ? kmalloc_order+0xf/0x3a
[Tue Jun 14 18:48:46 2016]  [<ffffffff810e89ed>] ? kmalloc_order_trace+0x19/0x8d
[Tue Jun 14 18:48:46 2016]  [<ffffffffa023e6b8>] ?
iscsit_dump_data_payload+0x50/0x181 [iscsi_target_mod]
[Tue Jun 14 18:48:46 2016]  [<ffffffffa024a5aa>] ?
iscsi_target_rx_thread+0xa48/0xa98 [iscsi_target_mod]
[Tue Jun 14 18:48:46 2016]  [<ffffffff81011684>] ? __switch_to+0x157/0x392
[Tue Jun 14 18:48:46 2016]  [<ffffffff8105f3b5>] ? dequeue_task_fair+0x163/0x1da
[Tue Jun 14 18:48:46 2016]  [<ffffffffa0249b62>] ?
iscsi_target_tx_thread+0x1a5/0x1a5 [iscsi_target_mod]
[Tue Jun 14 18:48:46 2016]  [<ffffffff81051a2b>] ? kthread+0x95/0x9d
[Tue Jun 14 18:48:46 2016]  [<ffffffff81538ed2>] ? ret_from_fork+0x22/0x40
[Tue Jun 14 18:48:46 2016]  [<ffffffff81051996>] ? init_completion+0x1d/0x1d
[Tue Jun 14 18:48:46 2016] Mem-Info:
[Tue Jun 14 18:48:46 2016] active_anon:164 inactive_anon:360 isolated_anon:0
                            active_file:871506 inactive_file:898134
isolated_file:32
                            unevictable:3780 dirty:9939 writeback:8 unstable:0
                            slab_reclaimable:46598 slab_unreclaimable:34865
                            mapped:1769 shmem:2 pagetables:236 bounce:0
                            free:39584 free_pcp:609 free_cma:0
[Tue Jun 14 18:48:46 2016] Node 0 DMA free:15852kB min:128kB low:160kB
high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15936kB managed:15852kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? yes
[Tue Jun 14 18:48:46 2016] lowmem_reserve[]: 0 3414 7875 7875
[Tue Jun 14 18:48:46 2016] Node 0 DMA32 free:88564kB min:28384kB
low:35480kB high:42576kB active_anon:236kB inactive_anon:368kB
active_file:1504596kB inactive_file:1546616kB unevictable:9528kB
isolated(anon):0kB isolated(file):0kB present:3578504kB
managed:3502092kB mlocked:9528kB dirty:10140kB writeback:16kB
mapped:4052kB shmem:8kB slab_reclaimable:80484kB
slab_unreclaimable:80736kB kernel_stack:688kB pagetables:568kB
unstable:0kB bounce:0kB free_pcp:1300kB local_pcp:640kB free_cma:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[Tue Jun 14 18:48:46 2016] lowmem_reserve[]: 0 0 4460 4460
[Tue Jun 14 18:48:46 2016] Node 0 Normal free:54036kB min:37020kB
low:46272kB high:55524kB active_anon:420kB inactive_anon:1072kB
active_file:1981428kB inactive_file:2045792kB unevictable:5592kB
isolated(anon):0kB isolated(file):128kB present:4700160kB
managed:4568028kB mlocked:5592kB dirty:29616kB writeback:16kB
mapped:3024kB shmem:0kB slab_reclaimable:105908kB
slab_unreclaimable:58724kB kernel_stack:2016kB pagetables:376kB
unstable:0kB bounce:0kB free_pcp:1140kB local_pcp:428kB free_cma:0kB
writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
[Tue Jun 14 18:48:46 2016] lowmem_reserve[]: 0 0 0 0
[Tue Jun 14 18:48:46 2016] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB
1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U)
1*2048kB (M) 3*4096kB (M) = 15852kB
[Tue Jun 14 18:48:46 2016] Node 0 DMA32: 8577*4kB (UME) 5795*8kB (UM)
496*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 88604kB
[Tue Jun 14 18:48:46 2016] Node 0 Normal: 9263*4kB (UE) 2116*8kB (U)
13*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 54188kB
[Tue Jun 14 18:48:46 2016] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
[Tue Jun 14 18:48:46 2016] 1771195 total pagecache pages
[Tue Jun 14 18:48:46 2016] 92 pages in swap cache
[Tue Jun 14 18:48:46 2016] Swap cache stats: add 4928, delete 4836,
find 802888/804117
[Tue Jun 14 18:48:46 2016] Free swap  = 10478716kB
[Tue Jun 14 18:48:46 2016] Total swap = 10485756kB
[Tue Jun 14 18:48:46 2016] 2073650 pages RAM
[Tue Jun 14 18:48:46 2016] 0 pages HighMem/MovableOnly
[Tue Jun 14 18:48:46 2016] 52157 pages reserved
[Tue Jun 14 18:48:46 2016] Unable to allocate 32768 bytes for offload buffer.
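
A rough sketch of how the per-zone free lists in the first report can be read.
The failed request is order:3, i.e. one contiguous 32 kB block (assuming 4 kB
pages), and the DMA32 and Normal zones list no free block of 32 kB or larger
even though tens of MB are free in total. The helper names below are my own and
purely illustrative; the DMA zone is left out because it is only about 15 MB
and is normally held back by lowmem_reserve.

```python
import re

# Free-list lines copied from the first dmesg report above
# (mail line wrapping undone, nothing else changed).
FREE_LISTS = {
    "DMA32": "3918*4kB (UME) 8928*8kB (U) 115*16kB (UM) 0*32kB 0*64kB "
             "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB",
    "Normal": "6115*4kB (UE) 3932*8kB (UM) 0*16kB 0*32kB 0*64kB "
              "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB",
}

PAGE_KB = 4  # assumption: 4 kB pages, as on x86_64


def parse_free_list(line):
    """Return {block_size_kB: free_block_count} for one zone line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def total_free_kb(free):
    """Total free memory in the zone, in kB."""
    return sum(size * count for size, count in free.items())


def blocks_at_or_above(free, order):
    """Count free blocks large enough for an allocation of this order."""
    need_kb = PAGE_KB << order
    return sum(count for size, count in free.items() if size >= need_kb)


if __name__ == "__main__":
    for zone, line in FREE_LISTS.items():
        free = parse_free_list(line)
        print(zone, total_free_kb(free), "kB free,",
              blocks_at_or_above(free, 3), "blocks usable for order:3")
```

The totals computed this way match the "= 88936kB" and "= 55916kB" figures
printed by the kernel, which suggests fragmentation rather than an outright
lack of free memory.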

N.B. I have these settings in sysctl.conf (as suggested somewhere):
------
vm.min_free_kbytes = 65536

net.core.rmem_max = 1048576
net.core.rmem_default = 1048576
net.core.wmem_max = 1048576
net.core.wmem_default = 1048576
net.ipv4.tcp_mem = 1048576 1048576 1048576
net.ipv4.tcp_rmem =  1048576 1048576 1048576
net.ipv4.tcp_wmem = 1048576 1048576
------
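
A small sanity check for the excerpt above, offered as a sketch only. The
kernel documentation describes tcp_rmem and tcp_wmem as three fields (min,
default, max, in bytes) and tcp_mem as three fields (min, pressure, max, in
pages). The function names are illustrative, and the excerpt is taken exactly
as pasted, so a flagged key may simply be a copy/paste slip in the mail.

```python
# Settings copied verbatim from the sysctl.conf excerpt above.
SYSCTL_CONF = """\
vm.min_free_kbytes = 65536

net.core.rmem_max = 1048576
net.core.rmem_default = 1048576
net.core.wmem_max = 1048576
net.core.wmem_default = 1048576
net.ipv4.tcp_mem = 1048576 1048576 1048576
net.ipv4.tcp_rmem =  1048576 1048576 1048576
net.ipv4.tcp_wmem = 1048576 1048576
"""

# Keys documented as vectors of three integers.
TRIPLE_KEYS = {"net.ipv4.tcp_mem", "net.ipv4.tcp_rmem", "net.ipv4.tcp_wmem"}


def parse_sysctl(text):
    """Parse 'key = v1 v2 ...' lines into {key: [values as strings]}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.split()
    return settings


def short_triples(settings):
    """Return the three-field keys that do not carry exactly three values."""
    return sorted(key for key in TRIPLE_KEYS
                  if key in settings and len(settings[key]) != 3)


if __name__ == "__main__":
    for key in short_triples(parse_sysctl(SYSCTL_CONF)):
        print(key, "does not have three fields")
```

Only the field count is checked here; whether 1048576 is a sensible value for
each field is a separate question.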

I hope this report is useful.
Thanks for your work,
Edoardo


On Wed, May 18, 2016 at 9:34 AM, Nicholas A. Bellinger
<nab@xxxxxxxxxxxxxxx> wrote:
> Hi Edoardo,
>
> Apologies for the delayed follow up on your bug report.
>
> Comments inline below.
>
> On Wed, 2016-05-04 at 10:38 +0200, Edoardo wrote:
>> Hi all,
>> I'm having troubles in my iSCSI target server after the update to
>> linux-4.5.2. I've always had some trouble, but not this bad.
>> This time the kernel simply crashes printing out “fixing recursive
>> fault but reboot is needed”, but reboot is actually the only thing I
>> can do.
>> Fortunately I was able to catch the dmesg output thanks to a remote
>> syslog server.
>> Can you help me sort this out?
>> I'm also testing on btrfs filesystem, so some troubles may come from there.
>>
>> I attached my saved targetcli configuration, and paste the info and
>> the dmesg output
>>
>> uname -a :
>> Linux gentoo-SMB1 4.5.2-gentoo #2 SMP Tue Apr 26 11:36:10 CEST 2016
>> x86_64 Intel(R) Pentium(R) CPU G3220 @ 3.00GHz GenuineIntel GNU/Linux
>>
>
> In your lio_start.sh, all /sys/kernel/config/target/iscsi/$IQN/$TPGT/
> endpoints change the parameter default ErrorRecoveryLevel=0 to
> ErrorRecoveryLevel=2 for the two active iscsi-target exports.
>
> The reason we default to ERL=0 is that MSFT initiators have long had
> problems following the ERL=2 connection recovery state machine in
> RFC-3720, resulting in hung scsi miniport I/O and other MSFT host side
> issues.
>
> Of course, the type of memory allocation failure you've observed below
> should not trigger a target OOPs, but to get a stable MSFT iSCSI
> host setup you really need to use the ERL=0 defaults for all
> exports.
>
>> [ 1154.103989] ignoring deprecated emulate_fua_read attribute
>> [ 1154.104021] ignoring deprecated emulate_dpo attribute
>> [ 1709.899686] Unable to locate ITT: 0xad891600 on CID: 1, dumping payload
>> [ 1709.899750] Unable to locate ITT: 0xae891600 on CID: 1, dumping payload
>> [ 1709.899792] Unable to locate ITT: 0xaf891600 on CID: 1, dumping payload
>> [ 1709.899841] Unable to locate ITT: 0xb0891600 on CID: 1, dumping payload
>> [ 1709.899856] Unable to locate ITT: 0xb1891600 on CID: 1, dumping payload
>> [ 1710.138608] Unable to locate ITT: 0xb2891600 on CID: 1, dumping payload
>> [ 1714.873446] Unable to locate ITT: 0x5b8a1600 on CID: 1, dumping payload
>> [ 1714.873513] Unable to locate ITT: 0x5c8a1600 on CID: 1, dumping payload
>> [ 1714.873644] Unable to locate ITT: 0x5d8a1600 on CID: 1, dumping payload
>> [ 1714.873689] Unable to locate ITT: 0x5e8a1600 on CID: 1, dumping payload
>> [ 1714.876817] Unable to locate ITT: 0x5f8a1600 on CID: 1, dumping payload
>> [ 1752.823610] Unable to locate ITT: 0xfd8e1600 on CID: 1, dumping payload
>> [ 1774.958443] Unable to locate ITT: 0x07911600 on CID: 1, dumping payload
>> [ 1774.960190] Unable to locate ITT: 0x08911600 on CID: 1, dumping payload
>> [ 1774.961896] Unable to locate ITT: 0x09911600 on CID: 1, dumping payload
>> [ 1774.963396] Unable to locate ITT: 0x0a911600 on CID: 1, dumping payload
>> [ 1774.965143] Unable to locate ITT: 0x0b911600 on CID: 1, dumping payload
>> [ 1774.966932] Unable to locate ITT: 0x0c911600 on CID: 1, dumping payload
>> [ 1774.968395] Unable to locate ITT: 0x0d911600 on CID: 1, dumping payload
>> [ 1864.857610] Unable to locate ITT: 0x54951600 on CID: 1, dumping payload
>> [ 1864.859108] Unable to locate ITT: 0x55951600 on CID: 1, dumping payload
>> [ 1868.897251] Unable to locate ITT: 0x73951600 on CID: 1, dumping payload
>> [ 1872.249432] Unable to locate ITT: 0x90951600 on CID: 1, dumping payload
>> [ 1872.249436] iscsi_trx: page allocation failure: order:3, mode:0x208c020
>> [ 1872.249439] CPU: 0 PID: 5031 Comm: iscsi_trx Not tainted 4.5.2-gentoo #2
>> [ 1872.249440] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A01 08/13/2013
>
> <SNIP>
>
>> [ 1872.249556] Unable to allocate 32768 bytes for offload buffer.
>> [ 1872.249559] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249560] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249561] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249563] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249564] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249565] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249566] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249567] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249568] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249569] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249570] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249571] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249573] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249574] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249576] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249577] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249578] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249579] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249583] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249584] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249585] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249586] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249587] NopOUT Flag's, Left Most Bit not set, protocol error.
>> [ 1872.249588] Got unknown iSCSI OpCode: 0x5b
>> [ 1872.249590] Unable to recover from unknown opcode while OFMarker=No, closing iSCSI connection.
>> [ 1903.106918] Unable to locate ITT: 0xea961600 on CID: 1, dumping payload
>> [ 1903.106974] Unable to locate ITT: 0xeb961600 on CID: 1, dumping payload
>> [ 1903.107017] Unable to locate ITT: 0xec961600 on CID: 1, dumping payload
>> [ 1903.107054] Unable to locate ITT: 0xed961600 on CID: 1, dumping payload
>> [ 1915.539126] Unable to locate ITT: 0x90971600 on CID: 1, dumping payload
>> [ 1915.539132] iscsi_trx: page allocation failure: order:3, mode:0x208c020
>> [ 1915.539135] CPU: 0 PID: 5284 Comm: iscsi_trx Not tainted 4.5.2-gentoo #2
>> [ 1915.539136] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A01 08/13/2013
>
> <SNIP>
>
>> [ 1915.539269] Unable to allocate 32768 bytes for offload buffer.
>
> It's strange that smallish order:3 memory allocations begin to fail this
> early..
>
>> [ 1915.539271] Got unknown iSCSI OpCode: 0x33
>> [ 1915.539272] Unable to recover from unknown opcode while OFMarker=No, closing iSCSI connection.
>> [ 1915.547337] BUG: unable to handle kernel paging request at ffffc900017e4100
>> [ 1915.547478] IP: [<ffffffffa03cfbf4>] iscsit_free_connection_recovery_entires+0x1d9/0x278 [iscsi_target_mod]
>> [ 1915.547592] PGD 21608a067 PUD 21608b067 PMD d9520067 PTE 0
>> [ 1915.547805] Oops: 0000 [#1] SMP
>> [ 1915.547933] Modules linked in: tcm_loop iscsi_target_mod
>> target_core_pscsi target_core_file target_core_iblock target_core_mod
>> kvm_intel kvm irqbypass crc32c_intel e1000e [last unloaded:
>> target_core_mod]
>> [ 1915.548539] CPU: 1 PID: 4150 Comm: kworker/1:1 Not tainted 4.5.2-gentoo #2
>> [ 1915.548597] Hardware name: Dell Inc. PowerEdge T20/0VD5HY, BIOS A01 08/13/2013
>> [ 1915.548662] Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod]
>> [ 1915.548759] task: ffff8801a6d1bf00 ti: ffff8801d95d4000 task.ti: ffff8801d95d4000
>> [ 1915.548817] RIP: 0010:[<ffffffffa03cfbf4>]  [<ffffffffa03cfbf4>] iscsit_free_connection_recovery_entires+0x1d9/0x278 [iscsi_target_mod]
>> [ 1915.548933] RSP: 0018:ffff8801d95d7c98  EFLAGS: 00010246
>> [ 1915.548987] RAX: ffff8800b68ad858 RBX: ffff8801c96e3000 RCX: ffffc900017e4100
>> [ 1915.549043] RDX: 0000000000000001 RSI: 00000001800c000b RDI: ffff8800b68ad868
>> [ 1915.549100] RBP: ffff8801c96e3070 R08: 0000000000000001 R09: ffffffffa03db052
>> [ 1915.549197] R10: ffffea0000b54000 R11: 0000000000100001 R12: ffff8801c96e3110
>> [ 1915.549294] R13: ffff8800b68ad868 R14: ffff8800b68ad840 R15: ffffc900017e3f00
>> [ 1915.549392] FS:  0000000000000000(0000) GS:ffff88021eb00000(0000) knlGS:0000000000000000
>> [ 1915.549531] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 1915.549625] CR2: ffffc900017e4100 CR3: 00000000c0ad1000 CR4: 00000000000406a0
>> [ 1915.549721] Stack:
>> [ 1915.549806]  ffff8800b68ad858 ffff8801c96e30f8 ffff880211a96000 ffff8801c96e3000
>> [ 1915.550103]  ffff880061b4f050 ffff880061b4f000 ffff880061b4f050 ffff880214ec99c0
>> [ 1915.550398]  ffff880211a96000 ffffffffa03db06a ffffffffa038845c ffff88002d500a74
>> [ 1915.550695] Call Trace:
>> [ 1915.550788]  [<ffffffffa03db06a>] ? iscsit_close_session+0xe5/0x164 [iscsi_target_mod]
>> [ 1915.550936]  [<ffffffffa038845c>] ? atomic_dec_mb+0x4/0x4 [target_core_mod]
>> [ 1915.551039]  [<ffffffffa0388bf5>] ? kref_put+0x2e/0x36 [target_core_mod]
>> [ 1915.551142]  [<ffffffffa03d0676>] ? iscsi_check_for_session_reinstatement+0x1bf/0x1d0 [iscsi_target_mod]
>> [ 1915.551290]  [<ffffffffa03d24d4>] ? iscsi_target_do_login+0x2f9/0x4e8 [iscsi_target_mod]
>> [ 1915.551434]  [<ffffffffa03d315f>] ? iscsi_target_do_login_rx+0x165/0x1e9 [iscsi_target_mod]
>> [ 1915.551578]  [<ffffffffa03d1e67>] ? iscsi_target_restore_sock_callbacks+0x8a/0x8a [iscsi_target_mod]
>> [ 1915.551722]  [<ffffffff8104c608>] ? process_one_work+0x19c/0x2b9
>> [ 1915.551818]  [<ffffffff8104ca5d>] ? worker_thread+0x1d9/0x2ad
>> [ 1915.551912]  [<ffffffff8104c884>] ? cancel_delayed_work_sync+0xa/0xa
>> [ 1915.552010]  [<ffffffff810507b7>] ? kthread+0x95/0x9d
>> [ 1915.552104]  [<ffffffff81050722>] ? kthread_parkme+0x16/0x16
>> [ 1915.552204]  [<ffffffff81538d5f>] ? ret_from_fork+0x3f/0x70
>> [ 1915.552293]  [<ffffffff81050722>] ? kthread_parkme+0x16/0x16
>> [ 1915.552382] Code: 86 90 00 00 00 c6 83 10 01 00 00 00 4d 8d 6e 28
>> 4c 89 ef e8 2e 8c 16 e1 49 8b 4e 18 49 8d 46 18 48 89 04 24 4c 8d b9
>> 00 fe ff ff <48> 8b 09 48 81 e9 00 02 00 00 49 8d bf 00 02 00 00 48 3b
>> 3c 24
>> [ 1915.555105] RIP  [<ffffffffa03cfbf4>] iscsit_free_connection_recovery_entires+0x1d9/0x278 [iscsi_target_mod]
>> [ 1915.555283]  RSP <ffff8801d95d7c98>
>> [ 1915.555367] CR2: ffffc900017e4100
>> [ 1915.555451] ---[ end trace 405a6f5266a8bf99 ]---
>>
>
> Ok, so after dumping repeated TCP payloads and two memory allocation
> failures, iscsi login eventually hits a kernel paging OOPs after a
> subsequent ERL=2 connection login failure.
>
> I still need to ponder how a proper bug-fix might work, but
> the issue itself is ERL=2 specific and does not affect ERL=0 operation.
>
> So as a work-around, go ahead and change to ERL=0 in your lio_start.sh
> for both IQN+TargetPortalGroupTag endpoints.  Note you'll need to force
> reconnect all MSFT sessions after changing this configfs attribute for
> the new values to take effect.
>
>>
>> Thanks for the help,
>> --
>> Edoardo Liverani
>> --
>
> Thanks for reporting!
>