Alan Stern wrote: > On Wed, 13 Aug 2008, Boaz Harrosh wrote: > >> But now that I stared at the code harder. Current code is a complete bug. >> the: "this_count = scsi_bufflen()" Has nothing to do with anything. >> Maybe it was meant to be: "this_count = scsi_bufflen() - good_bytes" >> That is the count left over after the first complete. But after we >> have completed good_bytes, the second complete is never scsi_bufflen(). >> (It used to work because most of the times it is bigger then the reminder >> but then we can just use: "this_count = ~0") >> >> Also what you did is not enough. What if the error is one of the known cases >> above, where the "complete and return" is inside the case. They will hang also. >> Here is what I think should be: >> >> --- >> git diff --stat -p drivers/scsi/ >> drivers/scsi/scsi_lib.c | 7 +++---- >> 1 files changed, 3 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >> index ff5d56b..36995e5 100644 >> --- a/drivers/scsi/scsi_lib.c >> +++ b/drivers/scsi/scsi_lib.c >> @@ -852,7 +852,7 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd) >> void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) >> { >> int result = cmd->result; >> - int this_count = scsi_bufflen(cmd); >> + int this_count; >> struct request_queue *q = cmd->device->request_queue; >> struct request *req = cmd->request; >> int error = 0; >> @@ -909,9 +909,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) >> if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL) >> return; >> >> - /* good_bytes = 0, or (inclusive) there were leftovers and >> - * result = 0, so scsi_end_request couldn't retry. >> - */ >> + this_count = blk_rq_bytes(req); >> + >> if (sense_valid && !sense_deferred) { >> switch (sshdr.sense_key) { >> case UNIT_ATTENTION: > > Is this really right? Consider the subcases that follow. For example, > in the UNIT ATTENTION case we have > > scsi_end_request(cmd, -EIO, this_count, 1); > > With this_count equal to blk_rq_bytes(req), there will be no leftovers > and so nothing will be requeued. Shouldn't it really be > > scsi_end_request(cmd, -EIO, 0, 1); > > The same holds for all the other places scsi_end_request() is called > with requeue equal to 1, including the final call (if result == 0). > This suggests that this_count should be eliminated entirely and each > of those calls be replaced by either > > goto retry; > or > goto fail; > > together with: > > retry: > scsi_end_request(cmd, -EIO, 0, 1); > return; > fail: > scsi_end_request(cmd, -EIO, blk_rq_bytes(req), 0); > return; > OK Yes, you are absolutely right. Please send a proposal and we'll stare at it for a while. One thing I don't like is the now blk_end_request(req, -EIO, bytes=0) inside scsi_end_request(). What's the point of calling that with 0 bytes? Maybe fix that too. > Alan Stern > > PS: In my copy of the code, scsi_end_request() doesn't use > blk_rq_bytes(). Has this been updated in the SCSI development tree? > No it does not. I was just commenting on the new code. If you fix up scsi_end_request() you should fix this too. Thanks for doing this. This is a long outstanding nagging bug. It was encountered before. Boaz -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html