Boaz Harrosh wrote: > Alan Stern wrote: >> On Wed, 13 Aug 2008, Boaz Harrosh wrote: >> >>> But now that I stared at the code harder. Current code is a complete bug. >>> the: "this_count = scsi_bufflen()" Has nothing to do with anything. >>> Maybe it was meant to be: "this_count = scsi_bufflen() - good_bytes" >>> That is the count left over after the first complete. But after we >>> have completed good_bytes, the second complete is never scsi_bufflen(). >>> (It used to work because most of the times it is bigger then the reminder >>> but then we can just use: "this_count = ~0") >>> >>> Also what you did is not enough. What if the error is one of the known cases >>> above, where the "complete and return" is inside the case. They will hang also. >>> Here is what I think should be: >>> >>> --- >>> git diff --stat -p drivers/scsi/ >>> drivers/scsi/scsi_lib.c | 7 +++---- >>> 1 files changed, 3 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >>> index ff5d56b..36995e5 100644 >>> --- a/drivers/scsi/scsi_lib.c >>> +++ b/drivers/scsi/scsi_lib.c >>> @@ -852,7 +852,7 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd) >>> void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) >>> { >>> int result = cmd->result; >>> - int this_count = scsi_bufflen(cmd); >>> + int this_count; >>> struct request_queue *q = cmd->device->request_queue; >>> struct request *req = cmd->request; >>> int error = 0; >>> @@ -909,9 +909,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) >>> if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL) >>> return; >>> >>> - /* good_bytes = 0, or (inclusive) there were leftovers and >>> - * result = 0, so scsi_end_request couldn't retry. >>> - */ >>> + this_count = blk_rq_bytes(req); >>> + >>> if (sense_valid && !sense_deferred) { >>> switch (sshdr.sense_key) { >>> case UNIT_ATTENTION: >> Is this really right? Consider the subcases that follow. For example, >> in the UNIT ATTENTION case we have >> >> scsi_end_request(cmd, -EIO, this_count, 1); >> >> With this_count equal to blk_rq_bytes(req), there will be no leftovers >> and so nothing will be requeued. Shouldn't it really be >> >> scsi_end_request(cmd, -EIO, 0, 1); >> >> The same holds for all the other places scsi_end_request() is called >> with requeue equal to 1, including the final call (if result == 0). >> This suggests that this_count should be eliminated entirely and each >> of those calls be replaced by either >> >> goto retry; >> or >> goto fail; >> >> together with: >> >> retry: >> scsi_end_request(cmd, -EIO, 0, 1); >> return; >> fail: >> scsi_end_request(cmd, -EIO, blk_rq_bytes(req), 0); >> return; >> > > OK Yes, you are absolutely right. Please send a proposal and we'll > stare at it for a while. > > One thing I don't like is the now blk_end_request(req, -EIO, bytes=0) > inside scsi_end_request(). What's the point of calling that with 0 bytes? > Maybe fix that too. > >> Alan Stern >> >> PS: In my copy of the code, scsi_end_request() doesn't use >> blk_rq_bytes(). Has this been updated in the SCSI development tree? >> > > No it does not. I was just commenting on the new code. If you fix up > scsi_end_request() you should fix this too. > > Thanks for doing this. This is a long outstanding nagging bug. It was > encountered before. > > Boaz > One more thing I do not understand is: inside scsi_io_completion() in some places it will requeue with: "scsi_end_request(cmd, -EIO, this_count, 1)" and at other places it will directly requeue with: "scsi_requeue_command(q, cmd);" The difference being in the later case not calling scsi_next_command(cmd); Do you understand why? Boaz -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html