Re: Kernel Crash on iSCSI volume create

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Hey Bryant,

After spending more time reviewing fs/configfs code today, I've
convenienced myself this is a long standing configfs race between
config_group shutdown and symlink creation.

Have you been able to verify the particular scenario is in fact what's
happening..?

Hi Nick,

I have been testing, and trying to free up resources.

We only ran into the issue when we had 10 hosts + 1k + vms.

SO far no issue, ill let you know if we see anything.

-Bryant

On Tue, 2017-05-30 at 23:26 -0700, Nicholas A. Bellinger wrote:
On Fri, 2017-05-26 at 09:51 -0500, Bryant G. Ly wrote:
On 5/26/17 1:51 AM, Nicholas A. Bellinger wrote:

Hey Bryant,

On Wed, 2017-05-24 at 13:34 -0500, Bryant G. Ly wrote:
Hi Nick,

We are seeing the below kernel crash with a block backstore.

The steps to reproduce are: Discovered a new iSCSI volume, created a new ibmvscsis instance, mapped it.

Mmmm, based upon the INQUIRY string this looks like a PSCSI backstore,
yes..?
No, its a block backstore, we dont support PSCSI via our management console (Novalink).
I don't know why the inquiry string would show it as a PSCSI backstore.

Ah, was getting confused by the initiator INQUIRY strings..

So a iscsi initiator provided struct scsi_device is registered
with /sys/kernel/config/target/core/pscsi_*, and then mapped to a
ibmvscsis lun, right..?

Btw looking at 4.8.17, nothing has changed since Jan
in /drivers/target/, and certainly nothing related to configfs operation
well before that..

I assume this not a new regression, right..?
We just started testing with iSCSI, so we did not see it before.

[ 7872.359650] scsi host6: iSCSI Initiator over TCP/IP
[ 7872.621777] scsi 6:0:0:0: Direct-Access     IBM      2145             0000 PQ: 0 ANSI: 6
[ 7872.665105] sd 6:0:0:0: Attached scsi generic sg14 type 0
[ 7872.665600] sd 6:0:0:0: [sdg] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 7872.666986] sd 6:0:0:0: [sdg] Write Protect is off
[ 7872.666990] sd 6:0:0:0: [sdg] Mode Sense: 97 00 10 08
[ 7872.667691] sd 6:0:0:0: [sdg] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 7872.680928] sd 6:0:0:0: [sdg] Attached SCSI disk
[ 7877.289863] rpadlpar_io: slot U8247.22L.212A91A-V1-C8 added
[ 7879.893760] ------------[ cut here ]------------
[ 7879.893768] WARNING: CPU: 15 PID: 17585 at ./include/linux/kref.h:46 config_item_get+0x7c/0x90 [configfs]
[ 7879.893768] Modules linked in: target_core_pscsi target_core_file target_core_iblock rpadlpar_io rpaphp ip6table_raw xt_CT xt_mac xt_tcpudp xt_comment xt_physdev xt_set ip_set_hash_net ip_set iscsi_target_mod iptable_raw dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag openvswitch nf_nat_ipv6 libcrc32c target_core_user uio binfmt_misc pseries_rng ibmvmc(O) vmx_crypto rtc_generic xt_conntrack nf_nat_ftp nf_conntrack_ftp nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv6 nf_defrag_ipv6 nbd ipt_REJECT nf_reject_ipv4 ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_filter ip_tables ip6table_filter ip6_tables ebtables x_tables br_netfilter
[ 7879.893797]  bridge stp llc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 ses enclosure scsi_transport_sas ibmvscsis target_core_mod configfs ibmvscsi ibmveth(O) qla2xxx scsi_transport_fc ipr
[ 7879.893811] CPU: 15 PID: 17585 Comm: targetcli Tainted: G           O    4.8.17-customv2.22 #12
[ 7879.893812] task: c00000018a0d3400 task.stack: c0000001f3b40000
[ 7879.893813] NIP: d000000002c664ec LR: d000000002c60980 CTR: c000000000b70870
[ 7879.893814] REGS: c0000001f3b43810 TRAP: 0700   Tainted: G           O     (4.8.17-customv2.22)
[ 7879.893815] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28222242  XER: 00000000
[ 7879.893820] CFAR: d000000002c664bc SOFTE: 1
                  GPR00: d000000002c60980 c0000001f3b43a90 d000000002c70908 c0000000fbc06820
                  GPR04: c0000001ef1bd900 0000000000000004 0000000000000001 0000000000000000
                  GPR08: 0000000000000000 0000000000000001 d000000002c69560 d000000002c66d80
                  GPR12: c000000000b70870 c00000000e798700 c0000001f3b43ca0 c0000001d4949d40
                  GPR16: c00000014637e1c0 0000000000000000 0000000000000000 c0000000f2392940
                  GPR20: c0000001f3b43b98 0000000000000041 0000000000600000 0000000000000000
                  GPR24: fffffffffffff000 0000000000000000 d000000002c60be0 c0000001f1dac490
                  GPR28: 0000000000000004 0000000000000000 c0000001ef1bd900 c0000000f2392940
[ 7879.893839] NIP [d000000002c664ec] config_item_get+0x7c/0x90 [configfs]
[ 7879.893841] LR [d000000002c60980] check_perm+0x80/0x2e0 [configfs]
[ 7879.893842] Call Trace:
[ 7879.893844] [c0000001f3b43ac0] [d000000002c60980] check_perm+0x80/0x2e0 [configfs]
[ 7879.893847] [c0000001f3b43b10] [c000000000329770] do_dentry_open+0x2c0/0x460
[ 7879.893849] [c0000001f3b43b70] [c000000000344480] path_openat+0x210/0x1490
[ 7879.893851] [c0000001f3b43c80] [c00000000034708c] do_filp_open+0xfc/0x170
[ 7879.893853] [c0000001f3b43db0] [c00000000032b5bc] do_sys_open+0x1cc/0x390
[ 7879.893856] [c0000001f3b43e30] [c000000000009584] system_call+0x38/0xec
[ 7879.893856] Instruction dump:
[ 7879.893858] 409d0014 38210030 e8010010 7c0803a6 4e800020 3d220000 e94981e0 892a0000
[ 7879.893861] 2f890000 409effe0 39200001 992a0000 <0fe00000> 4bffffd0 60000000 60000000
[ 7879.893866] ---[ end trace 14078f0b3b5ad0aa ]---
[ 7912.992583] sd 6:0:0:0: [sdg] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 7912.992587] sd 6:0:0:0: [sdg] tag#0 Sense Key : Illegal Request [current]
[ 7912.992589] sd 6:0:0:0: [sdg] tag#0 Add. Sense: Logical unit not supported
[ 7912.992591] sd 6:0:0:0: [sdg] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 80 00
[ 7912.992593] blk_update_request: I/O error, dev sdg, sector 0

The kref_get() on struct config_item->ci_kref in config_item_get() is
triggering a sanity check warning struct kref has reached zero, while a
new kref_get() is attempting to take a new reference.

Normally it means kref_get() should be using kref_get_unless_zero(), but
since this has never been reported before, it makes me think something
else outside of configfs proper is going on..

What configfs operations are triggering this warning, and what do the
configfs operations preceding it look like..?

Also, is this specific to PSCSI export or does it trigger independent of
target-core backend driver..?
They were attaching a iSCSI volume, so basically we do a discover/login and then targetcli
is called to do the mappings. Also between the attach there were a few ls queries.
All this is done via Novalink API(Virtualization management).

So the only configfs operations are through the mapping (backstores/block create) and the ls queries.

Ok, after doing further investigation in fs/configfs, it looks like the
config_item_get() triggering WARN_ON() here is most likely coming from
configfs_symlink() -> create_link().

So assuming this is the code-path in question, it looks like a backend
device is being released via:

    rmdir /sys/kernel/config/target/core/iblock_XXXX/$FOO

at the same time another process is attempting to symlink the same
configfs location via target_fabric_port_link() to a ibmvscsis port via:

    ln -s /sys/kernel/config/target/core/iblock_XXX/$FOO \
          /sys/kernel/config/target/ibmvscsis/$WWPN/$TPGT/lun/lun_XXX/$LINK

So let's first confirm this is in fact what's triggering the kref_get()
warning.  With CONFIG_DYNAMIC_DEBUG=y, and enabling full debug with:

    echo 'module target_core_mod +p' > /debug/dynamic_debug/control

should do the trick.

There is enough debug in target-core-mod already to
verify configfs_symlink() -> type->ct_item_ops->allow_link()
target_fabric_port_link() -> core_dev_add_lun() is what's getting
invoked, which happens immediately before create_link() ->
config_item_get() -> kref_get() gets called and hits the WARN_ON.

That said, I do see a race in fs/configfs/symlink.c:create_link() that
might explain what you've observed..

Looking at create_link(), the config_item_get() -> kref_get() for the
struct config_item (eg: se_dev_ci from target_fabric_port_link) symlink
target is attempted before the check for
target_sd->s_type & CONFIGFS_USET_DROPPING occurs, eg:

         sl = kmalloc(sizeof(struct configfs_symlink), GFP_KERNEL);
         if (sl) {
                 sl->sl_target = config_item_get(item);
                 spin_lock(&configfs_dirent_lock);
                 if (target_sd->s_type & CONFIGFS_USET_DROPPING) {
                         spin_unlock(&configfs_dirent_lock);
                         config_item_put(item);
                         kfree(sl);
                         return -ENOENT;
                 }
                 list_add(&sl->sl_list, &target_sd->s_links);
                 spin_unlock(&configfs_dirent_lock);
                 ret = configfs_create_link(sl, parent_item->ci_dentry,
                                            dentry);

which means configfs_rmdir() -> configfs_detach_prep() setting
CONFIGFS_USET_DROPPING and subsequently doing the final:

         /* Drop our reference from above */
         config_item_put(item);

could happen while config_item_get() is called from create_link(),
before the CONFIGFS_USET_DROPPING check occurs..

Here's a quick (untested) patch to close that potential race.

Please confirm with and without the following with debug enabled, and
let's see what we find.

diff --git a/fs/configfs/symlink.c b/fs/configfs/symlink.c
index a6ab012..c8aabba 100644
--- a/fs/configfs/symlink.c
+++ b/fs/configfs/symlink.c
@@ -83,14 +83,13 @@ static int create_link(struct config_item *parent_item,
         ret = -ENOMEM;
         sl = kmalloc(sizeof(struct configfs_symlink), GFP_KERNEL);
         if (sl) {
-               sl->sl_target = config_item_get(item);
                 spin_lock(&configfs_dirent_lock);
                 if (target_sd->s_type & CONFIGFS_USET_DROPPING) {
                         spin_unlock(&configfs_dirent_lock);
-                       config_item_put(item);
                         kfree(sl);
                         return -ENOENT;
                 }
+               sl->sl_target = config_item_get(item);
                 list_add(&sl->sl_list, &target_sd->s_links);
                 spin_unlock(&configfs_dirent_lock);
                 ret = configfs_create_link(sl, parent_item->ci_dentry,

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux