Re: GPF from target_put_nacl()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jul 30, 2017 at 5:15 PM, Nicholas A. Bellinger
<nab@xxxxxxxxxxxxxxx> wrote:
> On Thu, 2017-07-27 at 13:56 -0700, Justin Maggard wrote:
>> I ran across a GPF after disabling demo-mode on an iSCSI target.  The
>> backtrace pointed to target_put_nacl(), which was called from
>> transport_free_session().
>
> Can we see actual stack trace, and kernel version please..?
>

Sure.

kernel: general protection fault: 0000 [#1] SMP
kernel: Modules linked in:
kernel: CPU: 0 PID: 11047 Comm: iscsi_ttx Not tainted 4.13.0-rc2.x86_64.1+ #20
kernel: Hardware name: Intel Corporation S5500BC/S5500BC, BIOS
S5500.86B.01.00.0064.050520141428 05/05/2014
kernel: task: ffff88026939e800 task.stack: ffffc90007884000
kernel: RIP: 0010:target_put_nacl+0x49/0xb0
kernel: RSP: 0018:ffffc90007887d70 EFLAGS: 00010246
kernel: RAX: dead000000000200 RBX: ffff8802556ca000 RCX: 0000000000000000
kernel: RDX: dead000000000100 RSI: 0000000000000246 RDI: ffff8802556ce028
kernel: RBP: ffffc90007887d88 R08: 0000000000000001 R09: 0000000000000000
kernel: R10: ffffc90007887df8 R11: ffffea0009986900 R12: ffff8802556ce020
kernel: R13: ffff8802556ce028 R14: ffff8802556ce028 R15: ffffffff88d85540
kernel: FS:  0000000000000000(0000) GS:ffff88027fc00000(0000)
knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fffe36f5f94 CR3: 0000000009209000 CR4: 00000000003406f0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel:  transport_free_session+0x67/0x140
kernel:  transport_deregister_session+0x7a/0xc0
kernel:  iscsit_close_session+0x92/0x210
kernel:  iscsit_close_connection+0x5f9/0x840
kernel:  iscsit_take_action_for_connection_exit+0xfe/0x110
kernel:  iscsi_target_tx_thread+0x140/0x1e0
kernel:  ? wait_woken+0x90/0x90
kernel:  kthread+0x124/0x160
kernel:  ? iscsit_thread_get_cpumask+0x90/0x90
kernel:  ? kthread_create_on_node+0x40/0x40
kernel:  ret_from_fork+0x22/0x30
kernel: Code: 00 48 89 fb 4c 8b a7 48 01 00 00 74 68 4d 8d 6c 24 08 4c
89 ef e8 e8 28 43 00 48 8b 93 20 04 00 00 48 8b 83 28 04 00 00 4c 89
ef <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 20
kernel: RIP: target_put_nacl+0x49/0xb0 RSP: ffffc90007887d70
kernel: ---[ end trace f12821adbfd46fed ]---

>> A look at that function shows a pretty clear problem:
>>
>>    543                  mutex_unlock(&se_tpg->acl_node_mutex);
>>    544
>>    545                  if (se_nacl->dynamic_stop)
>>    546                          target_put_nacl(se_nacl);
>>    547
>>    548                  target_put_nacl(se_nacl);
>>    549          }
>>
>> So we make two calls to target_put_nacl() when se_nacl->dynamic_stop is true.
>
> Correct.
>
>> This got introduced by 01d4d6735589 (target: Fix multi-session dynamic
>> se_node_acl double free OOPs).  It looks to me like we can just
>> eliminate that first call, but I could easily be missing something.
>
> The double put in transport_free_session() when se_nacl->dynamic_stop is
> set is expected behavior, because when a dynamic se_node_acl is
> generated it takes the initial reference in target_alloc_node_acl(), and
> then a second reference in core_tpg_check_initiator_node_acl():
>
>        /*
>          * When allocating a dynamically generated node_acl, go ahead
>          * and take the extra kref now before returning to the fabric
>          * driver caller.
>          *
>          * Note this reference will be released at session shutdown
>          * time within transport_free_session() code.
>          */
>         kref_get(&acl->acl_kref);
>         acl->dynamic_node_acl = 1;
>
> So for se_node_acl->dynamic_node_acl = 0 in transport_free_session(),
> it's just dropping the se_node_acl->nacl_kref obtained by
> core_tpg_get_initiator_node_acl().  Eg, it's not dropping the last
> reference..
>
> But during the se_node_acl->dynamic_node_acl = 1 free special case,
> transport_free_session() is responsible for both target_put_nacl() calls
> to actually free se_node_acl.
>
> That is, normally for non dynamic ACLs, it's the configfs callback for
> se_node_acl->acl_group rmdir into core_tpg_del_initiator_node_acl() that
> drops the reference:
>
>         target_put_nacl(acl);
>         /*
>          * Wait for last target_put_nacl() to complete in target_complete_nacl()
>          * for active fabric session transport_deregister_session() callbacks.
>          */
>         wait_for_completion(&acl->acl_free_comp);
>
> But since se_node_acl->dynamic_node_acl = 1 doesn't have a corresponding
> configfs entry for user-space to remove, it's left up to
> transport_free_session() to set se_node_acl->dynamic_stop and drop both
> references to free se_node_acl.
>

Thanks a lot, that clears up my misunderstanding.  I should have read
the kref documentation carefully first.

So it looks like what's *really* happening is,
transport_free_session() does a list_del(acl_list). At least in my
case, this empties the list. Then, target_put_nacl() calls
target_complete_nacl(), which also does a list_del(acl_list) and
crashes.

-Justin
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux