On 2022/3/8 8:27, Mike Christie wrote:
The patch:
commit 5923d64b7ab6 ("scsi: libiscsi: Drop taskqueuelock")
added an extra task->state because for
commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix
list corruption regression")
we didn't know why we ended up with cmds on the list and thought it
might have been a bad target sending a response while we were still
sending the cmd. We were never able to get a target to send us a response
early, because it turns out the bug was just a race in libiscsi/
libiscsi_tcp where
1. iscsi_tcp_r2t_rsp queues a r2t to tcp_task->r2tqueue.
2. iscsi_tcp_task_xmit runs iscsi_tcp_get_curr_r2t and sees we have a r2t.
It dequeues it and iscsi_tcp_task_xmit starts to process it.
3. iscsi_tcp_r2t_rsp runs iscsi_requeue_task and puts the task on the
requeue list.
4. iscsi_tcp_task_xmit sends the data for r2t. This is the final chunk of
data, so the cmd is done.
5. target sends the response.
6. On a different CPU from #3, iscsi_complete_task processes the response.
Since there was no common lock for the list, the lists/tasks pointers are
not fully in sync, so could end up with list corruption.
Since it was just a race on our side, this patch removes the extra check
and fixes up the comments.
Reviewed-by: Lee Duncan <lduncan@xxxxxxxx>
Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
---
drivers/scsi/libiscsi.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 0a0076144874..5c74ab92725f 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -567,16 +567,19 @@ static bool cleanup_queued_task(struct iscsi_task *task)
struct iscsi_conn *conn = task->conn;
bool early_complete = false;
- /* Bad target might have completed task while it was still running */
+ /*
+ * We might have raced where we handled a R2T early and got a response
+ * but have not yet taken the task off the requeue list, then a TMF or
+ * recovery happened and so we can still see it here.
+ */
if (task->state == ISCSI_TASK_COMPLETED)
early_complete = true;
if (!list_empty(&task->running)) {
list_del_init(&task->running);
/*
- * If it's on a list but still running, this could be from
- * a bad target sending a rsp early, cleanup from a TMF, or
- * session recovery.
+ * If it's on a list but still running this could be cleanup
+ * from a TMF or session recovery.
*/
if (task->state == ISCSI_TASK_RUNNING ||
task->state == ISCSI_TASK_COMPLETED)
@@ -1484,7 +1487,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task,
}
/* regular RX path uses back_lock */
spin_lock(&conn->session->back_lock);
- if (rc && task->state == ISCSI_TASK_RUNNING) {
+ if (rc) {
/*
* get an extra ref that is released next time we access it
* as conn->task above.
Reviewed-by: Wu Bo <wubo40@xxxxxxxxxx>