On 10/27/2010 01:18 PM, Mike Christie wrote:
On 10/26/2010 06:09 PM, James Bottomley wrote:
On Tue, 2010-10-26 at 18:07 -0500, Mike Christie wrote:
On 10/25/2010 03:53 PM, James Bottomley wrote:
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 1de30eb..484b128 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1156,51 +1156,40 @@ static int scsi_eh_target_reset(struct
Scsi_Host *shost,
struct list_head *work_q,
struct list_head *done_q)
{
- struct scsi_cmnd *scmd, *tgtr_scmd, *next;
- unsigned int id = 0;
- int rtn;
+ LIST_HEAD(tmp_list);
- do {
- tgtr_scmd = NULL;
- list_for_each_entry(scmd, work_q, eh_entry) {
- if (id == scmd_id(scmd)) {
- tgtr_scmd = scmd;
- break;
- }
- }
- if (!tgtr_scmd) {
- /* not one exactly equal; find the next highest */
- list_for_each_entry(scmd, work_q, eh_entry) {
- if (scmd_id(scmd)> id&&
- (!tgtr_scmd ||
- scmd_id(tgtr_scmd)> scmd_id(scmd)))
- tgtr_scmd = scmd;
- }
- }
- if (!tgtr_scmd)
- /* no more commands, that's it */
- break;
+ list_splice(work_q,&tmp_list);
I think I am hitting multiple bugs. One might be with this patch, but
strangely they both hit __list_add bugs in the scsi eh. I think this
needs to be list_splice_init().
I think that's exactly right ... I keep getting tripped up by the _init
requirements for reusing lists (and list_heads).
+ list_for_each_entry_safe(scmd, next,&tmp_list, eh_entry) {
+ if (scmd_id(scmd) != id)
+ continue;
+
+ if ((rtn == SUCCESS || rtn == FAST_IO_FAIL)
+ && (!scsi_device_online(scmd->device) ||
+ rtn == FAST_IO_FAIL || !scsi_eh_tur(scmd)))
+ scsi_eh_finish_cmd(scmd, done_q);
Because if the target resets work and move all the commands to the
done_q here.
+ else
+ /* push back on work queue for further processing */
+ list_move(&scmd->eh_entry, work_q);
+ }
return list_empty(work_q);
}
Then this work_q is showing garbage in it from the splice I think.
Without the list_splice_init I get that trace I cut and pasted in the
other mail when I just force the scsi eh to run the target reset code.
In the test, the target reset succeeds. The scsi_eh_finish_cmd call
above moves all the cmds to the done_q. But the work_q still has junk in
it, and so we return that we need to do more work and eventually we end
up running scsi_eh_offline_sdevs which calls scsi_eh_finish_cmd on a
command we already moved to the done_q.
So does changing to the _init version fix at least the junk done_q
problem?
Yeah.
Hey James,
Your patch with the _init fixed the problem for me. Were you going to
merge the patch or did you want me to test Hilf's version?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html