> -----Original Message-----
> From: Sagi Grimberg [mailto:sagig@xxxxxxxxxxxxxxxxxx]
> Sent: Tuesday, January 27, 2015 8:21 AM
> To: Chris Moore; Nicholas A. Bellinger
> Cc: target-devel; Sagi Grimberg; Or Gerlitz; Greg Kroah-Hartman
> Subject: Re: [RFC-v3.10.y 5/8] iser-target: Parallelize CM connection
> establishment
>
> On 1/27/2015 6:05 PM, Chris Moore wrote:
> <SNIP>
> > I haven't tried testing these against the latest 3.10. I tried
> > applying them to the RHEL 7.1 Snap3 kernel but they wouldn't apply
> > cleanly - too many other changes to 3.10 haven't been picked up yet by RH.
> >
> > I applied all the changes by hand to RHEL 7.1 and had no problems with
> > booting, but the first time I try to login from the initiator I get a hang
> > on the target.
>
> rrr...
>
>
> Can you share your log (with debug?). This is probably patches 5,6 which
> might be better heading to 3.12+.
>
> Patches 1,2 are harmless obviously.
> Patches 3,4,8 handle some issues in session teardown sequence.
> Patches 5,6 handle issues in the login sequence (probably causing your
> RHEL7.1 to hang...)
> Patch 7 is a fix for bond failover test case, but won't work without patch 5
> applied.

Log is below. Like I said, I applied all the changes by hand, so I may have
missed something.

It looks like the lockup is happening when isert_connect_request() tries to
grab the np_thread_lock. It may be confusion about cma_id->context.

What should the context be? Should it be an isert_np, an isert_conn, or does
it change depending on the situation?

borneo login: [ 101.267481] BUG: soft lockup - CPU#2 stuck for 23s! [kworker/2:1:235]
[ 101.273923] Modules linked in: target_core_pscsi target_core_file target_core_iblock rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache intel_powerclamp coretemp intel_rapl kvm_intel kvm xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt crct10dif_pclmul crc32_pclmul target_core_mod crc32c_intel ghash_clmulni_intel ib_srp scsi_transport_srp scsi_tgt ib_ipoib aesni_intel rdma_ucm ib_ucm lrw gf128mul ib_uverbs ib_umad sb_edac rdma_cm ib_cm glue_helper ablk_helper iw_cm ipmi_si ib_sa iTCO_wdt iTCO_vendor_support cryptd ipmi_msghandler edac_core wmi ib_mad pcspkr ocrdma ib_core mei_me ib_addr ntb mei i2c_i801 ioatdma lpc_ich mfd_core shpchp xfs libcrc32c sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ttm isci drm igb libsas ptp ahci be2net libahci scsi_transport_sas pps_core libata dca vxlan i2c_algo_bit i2c_core ip_tunnel dm_mirror dm_region_hash dm_log dm_mod
[ 101.359717] CPU: 2 PID: 235 Comm: kworker/2:1 Not tainted 3.10.0iser_patched #2
[ 101.367021] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.00 07/05/2013
[ 101.373978] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 101.378958] task: ffff880459d196c0 ti: ffff880459fd0000 task.ti: ffff880459fd0000
[ 101.386435] RIP: 0010:[<ffffffff8160b1a6>] [<ffffffff8160b1a6>] _raw_spin_lock_bh+0x36/0x50
[ 101.394887] RSP: 0018:ffff880459fd3c00 EFLAGS: 00000206
[ 101.400199] RAX: 0000000000000328 RBX: 0000000000000000 RCX: 0000000000008804
[ 101.407322] RDX: 000000000000fffe RSI: 000000000000fffe RDI: ffff880467b7fd54
[ 101.414449] RBP: ffff880459fd3c08 R08: ffff88045677c640 R09: 0000000000000000
[ 101.421580] R10: ffff88046f003a00 R11: 000000005c981f37 R12: ffff880459fd3bd8
[ 101.428704] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
[ 101.435829] FS: 0000000000000000(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000
[ 101.443906] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 101.449644] CR2: 00007fa13015d270 CR3: 000000000190a000 CR4: 00000000000407e0
[ 101.456768] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 101.463892] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 101.471018] Stack:
[ 101.473028] ffff88046504f400 ffff880459fd3c60 ffffffffa05a4e0d 01000014ffff0000
[ 101.480481] ffffffff813b7300 ffff880467b7fd00 ffff88045792b9e0 ffff880466b28e00
[ 101.487935] ffff880463c93800 ffff88046504f660 ffff88046504f400 0000000000000000
[ 101.495387] Call Trace:
[ 101.497837] [<ffffffffa05a4e0d>] isert_cma_handler+0x14d/0xa20 [ib_isert]
[ 101.504709] [<ffffffff813b7300>] ? get_random_bytes+0x20/0x30
[ 101.510541] [<ffffffffa03e7ef6>] cma_req_handler+0x3b6/0x710 [rdma_cm]
[ 101.517155] [<ffffffffa042b3d5>] cm_process_work+0x25/0x140 [ib_cm]
[ 101.523507] [<ffffffffa042bc17>] cm_req_handler+0x727/0xdd0 [ib_cm]
[ 101.529859] [<ffffffffa042cae5>] cm_work_handler+0x185/0x1460 [ib_cm]
[ 101.536386] [<ffffffff8108f0ab>] process_one_work+0x17b/0x470
[ 101.542217] [<ffffffff8108fe8b>] worker_thread+0x11b/0x400
[ 101.547781] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400
[ 101.553604] [<ffffffff8109726f>] kthread+0xcf/0xe0
[ 101.558475] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[ 101.564995] [<ffffffff81613d3c>] ret_from_fork+0x7c/0xb0
[ 101.570393] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[ 101.576918] Code: 89 fb e8 1e b5 a6 ff b8 00 00 02 00 f0 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 5b 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 0f b7 0b <66> 39 ca 74 ea f3 90 83 e8 01 75 f1 48 89 df 66 66 66 90 66 66