On Tue, 2010-10-26 at 18:07 -0500, Mike Christie wrote: > On 10/25/2010 03:53 PM, James Bottomley wrote: > > > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > > index 1de30eb..484b128 100644 > > --- a/drivers/scsi/scsi_error.c > > +++ b/drivers/scsi/scsi_error.c > > @@ -1156,51 +1156,40 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost, > > struct list_head *work_q, > > struct list_head *done_q) > > { > > - struct scsi_cmnd *scmd, *tgtr_scmd, *next; > > - unsigned int id = 0; > > - int rtn; > > + LIST_HEAD(tmp_list); > > > > - do { > > - tgtr_scmd = NULL; > > - list_for_each_entry(scmd, work_q, eh_entry) { > > - if (id == scmd_id(scmd)) { > > - tgtr_scmd = scmd; > > - break; > > - } > > - } > > - if (!tgtr_scmd) { > > - /* not one exactly equal; find the next highest */ > > - list_for_each_entry(scmd, work_q, eh_entry) { > > - if (scmd_id(scmd)> id&& > > - (!tgtr_scmd || > > - scmd_id(tgtr_scmd)> scmd_id(scmd))) > > - tgtr_scmd = scmd; > > - } > > - } > > - if (!tgtr_scmd) > > - /* no more commands, that's it */ > > - break; > > + list_splice(work_q,&tmp_list); > > > I think I am hitting multiple bugs. One might be with this patch, but > strangely they both hit __list_add bugs in the scsi eh. I think this > needs to be list_splice_init(). I think that's exactly right ... I keep getting tripped up by the _init requirements for reusing lists (and list_heads). > > + list_for_each_entry_safe(scmd, next,&tmp_list, eh_entry) { > > + if (scmd_id(scmd) != id) > > + continue; > > + > > + if ((rtn == SUCCESS || rtn == FAST_IO_FAIL) > > + && (!scsi_device_online(scmd->device) || > > + rtn == FAST_IO_FAIL || !scsi_eh_tur(scmd))) > > + scsi_eh_finish_cmd(scmd, done_q); > > Because if the target resets work and move all the commands to the > done_q here. > > > > + else > > + /* push back on work queue for further processing */ > > + list_move(&scmd->eh_entry, work_q); > > + } > > > > return list_empty(work_q); > > } > > > Then this work_q is showing garbage in it from the splice I think. > Without the list_splice_init I get that trace I cut and pasted in the > other mail when I just force the scsi eh to run the target reset code. > > In the test, the target reset succeeds. The scsi_eh_finish_cmd call > above moves all the cmds to the done_q. But the work_q still has junk in > it, and so we return that we need to do more work and eventually we end > up running scsi_eh_offline_sdevs which calls scsi_eh_finish_cmd on a > command we already moved to the done_q. So does changing to the _init version fix at least the junk done_q problem? James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html