RE: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress

Hi Max,

I tried the patch, but no luck; the issue is still seen.
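
For anyone following the thread, here is a rough userspace model of the accounting problem described in the original report quoted below. It is only a sketch under stated assumptions: the names (mr_desc, mr_pool, pool_get, pool_put, pool_free_idle) are hypothetical and are not ib_iser symbols, and plain malloc/free stand in for MR registration. It only illustrates how a teardown that walks the idle list misses descriptors still held by running tasks.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in ib_iser. */
struct mr_desc {
    struct mr_desc *next;
};

struct mr_pool {
    struct mr_desc *free_list; /* descriptors currently idle in the pool */
    int size;                  /* descriptors allocated and not yet freed */
};

/* Allocate 'count' descriptors and put them all on the idle list. */
static int pool_init(struct mr_pool *pool, int count)
{
    assert(pool != NULL);
    pool->free_list = NULL;
    pool->size = 0;
    for (int i = 0; i < count; i++) {
        struct mr_desc *d = malloc(sizeof(*d));
        if (!d)
            return -1; /* caller may still call pool_free_idle() */
        d->next = pool->free_list;
        pool->free_list = d;
        pool->size++;
    }
    return 0;
}

/* IO submission path: take a descriptor off the idle list. */
static struct mr_desc *pool_get(struct mr_pool *pool)
{
    struct mr_desc *d = pool->free_list;
    if (d) {
        pool->free_list = d->next;
        d->next = NULL;
    }
    return d;
}

/* IO cleanup path: hand the descriptor back to the idle list. */
static void pool_put(struct mr_pool *pool, struct mr_desc *d)
{
    assert(d != NULL);
    d->next = pool->free_list;
    pool->free_list = d;
}

/*
 * Teardown: frees only what is on the idle list. Returns the number of
 * descriptors that were still held by in-flight tasks and so not freed.
 */
static int pool_free_idle(struct mr_pool *pool)
{
    int freed = 0;
    while (pool->free_list) {
        struct mr_desc *d = pool->free_list;
        pool->free_list = d->next;
        free(d);
        freed++;
    }
    int remaining = pool->size - freed;
    if (remaining)
        fprintf(stderr, "pool still has %d regions registered\n", remaining);
    pool->size = remaining;
    return remaining;
}
```

In this model, if pool_free_idle() runs while one descriptor taken with pool_get() has not yet been returned with pool_put(), it returns 1 and that descriptor outlives the teardown. A real fix would have to make teardown wait for, or reclaim, the in-flight descriptors before the PD is deallocated; this sketch does not attempt that.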

-Raju

-----Original Message-----
From: Max Gurtovoy [mailto:maxg@xxxxxxxxxxxx] 
Sent: 01 February 2017 20:48
To: Raju Rangoju <rajur@xxxxxxxxxxx>; Sagi Grimberg <sagi@xxxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx
Cc: SWise OGC <swise@xxxxxxxxxxxxxxxxxxxxx>; Potnuri Bharat Teja <bharat@xxxxxxxxxxx>
Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress

Hi Raju,
Please apply the attached patch, which I want to push soon (I still haven't found the chance to test it).
I'm not sure it will solve your problem, but let's try it.

thanks,
Max.

On 2/1/2017 11:08 AM, Raju  Rangoju wrote:
>
> Hello Sagi,
>
> I intermittently see an issue with iser when unloading the iw_cxgb4 module while traffic is running. Apparently the RDMA resources are not released when iser receives the RDMA_CM_EVENT_DEVICE_REMOVAL event while IO is in progress. On receiving the DEVICE_REMOVAL event, iser_cma_handler() destroys the device by calling iser_cleanup_handler(). iser_free_ib_conn_res() destroys the QP, calls iser_free_fastreg_pool() to free the memory regions (MRs) on the fastreg_pool list, and then calls ib_dealloc_pd().
>
> Issue:
>
> During normal IO, iSCSI uses its .xmit_task and .cleanup_task callbacks to get MRs from, and put MRs back into, the iser fr_pool (fastreg_pool). If the DEVICE_REMOVAL event is received at this point, iser_cma_handler()->iser_cleanup_handler() releases only the MRs currently available on the fr_pool list (some MRs may have been moved to the running task list) and eventually calls ib_dealloc_pd(), which ends up hitting a kernel panic because some registered MRs were never freed.
>
> iser_free_fastreg_pool() complains about the regions that are still registered: "pool still has %d regions registered"
>
> Trace:
>
> iser: iser_free_fastreg_pool: pool still has 1 regions registered
> iser: iser_device_try_release: device ffff880508660080 refcount 0 
> iw_cxgb4:c4iw_destroy_cq ib_cq ffff8803f3addc00 
> iw_cxgb4:c4iw_wait_for_reply add wr_waitp ffffc9000dd83a28 
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 14790 at drivers/infiniband/core/verbs.c:305 ib_dealloc_pd+0x87/0xd0 [ib_core]
> Modules linked in: rdma_ucm ib_uverbs iw_cxgb4(OE-) autofs4 target_core_iblock target_core_file target_core_pscsi target_core_mod bnx2fc fcoe libfcoe 8021q libfc garp stp llc scsi_transport_fc cpufreq_ondemand be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio libcxgb ib_iser rdma_cm ib_cm iw_cm ib_core configfs ipv6 crc_ccitt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi uinput ppdev iTCO_wdt iTCO_vendor_support serio_raw pcspkr parport_pc parport tpm_infineon sg i2c_i801 i2c_core lpc_ich mfd_core e1000e acpi_cpufreq i7core_edac edac_core ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) pata_acpi(E) ata_generic(E) ata_piix(E) floppy(E) cxgb4(OE) ptp(E) pps_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> CPU: 7 PID: 14790 Comm: rmmod Tainted: G           OE   4.10.0-rc4+ #22
> Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
> Call Trace:
> dump_stack+0x51/0x78
> __warn+0xfd/0x120
> warn_slowpath_null+0x1d/0x20
> ib_dealloc_pd+0x87/0xd0 [ib_core]
> ? ib_unregister_event_handler+0x6d/0x80 [ib_core]
> ? mutex_lock+0x16/0x40
> iser_device_try_release+0x81/0x120 [ib_iser]
> ? iser_free_rx_descriptors+0xd3/0xf0 [ib_iser]
> iser_free_ib_conn_res+0x75/0xb0 [ib_iser]
> iser_cleanup_handler+0x41/0x70 [ib_iser]
> iser_cma_handler+0x1c9/0x220 [ib_iser]
> cma_remove_id_dev+0x8f/0xa0 [rdma_cm]
> cma_process_remove+0x127/0x170 [rdma_cm]
> ? kobject_cleanup+0x82/0x1b0
> ? kobject_release+0xd/0x10
> cma_remove_one+0x6f/0x90 [rdma_cm]
> ib_unregister_device+0xe7/0x190 [ib_core]
> c4iw_unregister_device+0x79/0x90 [iw_cxgb4]
> c4iw_remove+0x45/0x6c [iw_cxgb4]
> c4iw_exit_module+0x31/0x75 [iw_cxgb4]
> SyS_delete_module+0x183/0x1d0
> ? syscall_trace_enter+0x154/0x1f0
> ? SyS_munmap+0x6e/0x90
> do_syscall_64+0x6c/0x160
> entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x37d22e8ee7
> RSP: 002b:00007ffedd1877b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 00007ffedd1877c0 RCX: 00000037d22e8ee7
> RDX: 00007ffedd1877af RSI: 0000000000000880 RDI: 00007ffedd1877c0
> RBP: 00007ffedd187810 R08: 00007f0120b48700 R09: 0000000000000100
> R10: 0000000000000011 R11: 0000000000000206 R12: 0000000000000880
> R13: 00007ffedd188735 R14: 0000000000000000 R15: 0000000000000001
> ---[ end trace 9bdbdddd5759d7e6 ]---
>
>
> Steps to reproduce:
> 1. Bring up the iser target setup.
> 2. Bring up the iser initiator setup.
> 3. From the DUT (initiator), log in to all the targets and start IOzone traffic on all the mounted LUNs.
> 4. Unload the iw_cxgb4 module on the iser initiator.
>
>
> This is a generic issue, seen with other vendors also.
>
> Could you give me a few pointers on how to debug this further?
> I am happy to provide any further details.
>
> Thank you for any help you can provide,
> -Raju
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>