On Tue, 2014-03-04 at 17:17 +0200, Sagi Grimberg wrote:
> On 3/4/2014 2:00 AM, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
> >
> > Hi Or & Sagi,
> >
> > This series addresses a number of active I/O shutdown related issues
> > in iser-target code that have come up recently during stress testing.
> >
> > Note there is still a separate iser-target network portal shutdown
> > bug being tracked down, but this series addresses all existing issues
> > related to active I/O session shutdown.
> >
> > The patch breakdown looks like:
> >
> > Patch #1 fixes a long-standing bug where TPGs in shutdown incorrectly
> > could be referenced by new login attempts.
> >
> > Patch #2 converts list_del -> list_del_init for iscsi_cmd->i_conn_node
> > so that list_empty works correctly.
> >
> > Patch #3 addresses isert_conn->state related bugs resulting in hung
> > shutdown, and splits isert_free_conn() into separate code that is
> > called earlier during shutdown to ensure that all outstanding I/O
> > has completed.
> >
> > Patch #4 fixes incorrect accounting of ->post_send_buf_count during
> > active I/O shutdown with outstanding RDMA WRITE + RDMA READ work
> > requests.
> >
> > Patch #5 addresses a bug related to active I/O shutdown with
> > outstanding FRMR work requests. Note this patch is specific to
> > v3.12+ code.
> >
> > Patch #6 addresses bugs related to active I/O shutdown with
> > outstanding completion interrupt coalescing batches. Note this patch
> > is specific to v3.13+ code.
> >
> > Please review.
>
> Hey Nic,
>
> So besides a minor comment, you have my Ack on this set.
>

Thanks!

> More on cleanup flow. isert_cma_handler does not handle
> RDMA_CM_EVENT_TIMEWAIT_EXIT.
> To be more specific, according to IB spec, when initiating disconnect
> (rdma_disconnect/ib_send_cm_dreq),
> one should not destroy a used qp until getting TIMEWAIT_EXIT CM event.
> We are working on this in iSER initiator.
> It might lead to "stale connection" CM rejects on future connections
> (SRP also does not do that).
>

<nod>, I noticed that as well during recent debugging.

However, AFAICT the RDMA_CM_EVENT_TIMEWAIT_EXIT doesn't (always) occur
on the target side after an RDMA_CM_EVENT_DISCONNECTED, and thus far
I've not been able to ascertain what's different about the shutdown
sequence that would make this happen, or not happen.

Any ideas?

--nab
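
For reference, the ordering Sagi describes (do not destroy a used QP
until the TIMEWAIT_EXIT CM event arrives) can be sketched as a small
userspace model. This is only an illustration and not the
isert_cma_handler code: every sketch_* name below is hypothetical, and
the enum values merely stand in for RDMA_CM_EVENT_DISCONNECTED and
RDMA_CM_EVENT_TIMEWAIT_EXIT. It also does not address the case raised
above, where TIMEWAIT_EXIT never shows up on the target side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two RDMA CM events discussed above. */
enum sketch_cm_event {
	SKETCH_EV_DISCONNECTED,
	SKETCH_EV_TIMEWAIT_EXIT,
};

/* Hypothetical connection states for this model only. */
enum sketch_conn_state {
	SKETCH_CONN_UP,
	SKETCH_CONN_DISCONNECTED,	/* disconnect seen, QP must stay alive */
	SKETCH_CONN_TIMEWAIT_DONE,	/* timewait finished, QP may go away */
};

struct sketch_conn {
	enum sketch_conn_state state;
	bool qp_destroyed;
};

/*
 * Model of a CM event handler that defers QP teardown: DISCONNECTED only
 * records the state change, and the QP is marked destroyed solely when
 * TIMEWAIT_EXIT is delivered.
 */
static void sketch_cm_handler(struct sketch_conn *conn,
			      enum sketch_cm_event ev)
{
	assert(conn != NULL);

	switch (ev) {
	case SKETCH_EV_DISCONNECTED:
		if (conn->state == SKETCH_CONN_UP)
			conn->state = SKETCH_CONN_DISCONNECTED;
		break;
	case SKETCH_EV_TIMEWAIT_EXIT:
		conn->state = SKETCH_CONN_TIMEWAIT_DONE;
		conn->qp_destroyed = true;
		break;
	}
}
```

In this model a DISCONNECTED event alone leaves qp_destroyed false, which
is the property the IB spec rule asks for. A real handler would still
need some policy for the event not arriving at all.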