On Thu, 2008-02-28 at 22:56 +0800, Keith Hopkins wrote:
> On 02/20/2008 02:44 AM, Darrick J. Wong wrote:
> > If we send an ABORT_TASK ascb that doesn't return within the timeout period,
> > we should not free that ascb because the sequencer is still holding onto it.
> > Hopefully it will fix what James Bottomley describes below:
> >
> > On Tue, Feb 19, 2008 at 10:22:20AM -0600, James Bottomley wrote:
> >
> >> Unfortunately, there's a bug in TMF timeout handling in the driver, it
> >> leaves the sequencer entry pending, but frees the ascb. If the
> >> sequencer ever picks this up it will get very confused, as it does a
> >> while down in the trace:
> >>
> >>> aic94xx: BUG:sequencer:dl:no ascb?!
> >>> aic94xx: BUG:sequencer:dl:no ascb?!
> >> That's where the sequencer adds an ascb to the done list that we've
> >> already freed. From this point on confusion reigns and the error
> >> handler eventually offlines the device.
> >>
> >> I'll see if I can come up with patches to fix this ... or at least
> >> mitigate the problems it causes.
> >
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > ---
> >
> >  drivers/scsi/aic94xx/aic94xx_tmf.c |    7 ++++++-
> >  1 files changed, 6 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/scsi/aic94xx/aic94xx_tmf.c b/drivers/scsi/aic94xx/aic94xx_tmf.c
> > index b52124f..4b24bd3 100644
> > --- a/drivers/scsi/aic94xx/aic94xx_tmf.c
> > +++ b/drivers/scsi/aic94xx/aic94xx_tmf.c
> > @@ -463,7 +463,7 @@ int asd_abort_task(struct sas_task *task)
> >                                             AIC94XX_SCB_TIMEOUT);
> >          spin_lock_irqsave(&task->task_state_lock, flags);
> >          if (leftover < 1)
> > -                res = TMF_RESP_FUNC_FAILED;
> > +                goto out_not_reported;
> >          if (task->task_state_flags & SAS_TASK_STATE_DONE)
> >                  res = TMF_RESP_FUNC_COMPLETE;
> >          spin_unlock_irqrestore(&task->task_state_lock, flags);
> > @@ -487,6 +487,11 @@ out:
> >                  asd_ascb_free(ascb);
> >          ASD_DPRINTK("task 0x%p aborted, res: 0x%x\n", task, res);
> >          return res;
> > +
> > +out_not_reported:
> > +        spin_unlock_irqrestore(&task->task_state_lock, flags);
> > +        ASD_DPRINTK("task 0x%p aborted? but not reported.\n", task);
> > +        return res;
> >  }
> >
> >  /**
> > -
>
> Hi Darrick,
>
> Is this the only patch for ascb sequencer use after free problems, or are you still looking into that?

Sorry, I forgot to cc you. Actually this one is the full one:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-rc-fixes-2.6.git;a=commit;h=e2396f1e4ecd438a15fa653a028b93e95013caa3

Unfortunately, there are another five patches in that git tree that
you'll also need to see if we can get aic94xx working on your box.

If you're willing, could you use 2.6.25-rc3 as the base kernel and just
apply

http://www.kernel.org/pub/linux/kernel/people/jejb/scsi-rc-fixes-2.6.diff

on top of it? That should give you a kernel patched with all of the
pending aic94xx and libsas fixes.

Thanks,

James
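
For anyone following along, the ownership rule the patch relies on can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: every demo_* name below is hypothetical and none of it is the
real aic94xx or libsas code. It assumes the sequencer eventually hands a
timed-out ascb back (modelled here by demo_sequencer_done()), and it replaces
the real wait-for-completion with a simple flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ascb; not the driver's struct asd_ascb. */
struct demo_ascb {
	bool owned_by_sequencer;	/* sequencer still has a pending entry */
};

enum demo_res { DEMO_COMPLETE, DEMO_FAILED };

/* Counts live ascbs so leaks and double frees show up in a check. */
static int demo_live_ascbs;

static struct demo_ascb *demo_ascb_alloc(void)
{
	struct demo_ascb *a = calloc(1, sizeof(*a));

	if (a)
		demo_live_ascbs++;
	return a;
}

static void demo_ascb_free(struct demo_ascb *a)
{
	free(a);
	demo_live_ascbs--;
}

/*
 * Post an abort and "wait" for it. completes_in_time is an assumed
 * substitute for the real timeout wait. On timeout the ascb is NOT
 * freed, because the sequencer may still reference it; it is handed
 * back through *pending instead.
 */
static enum demo_res demo_abort_task(bool completes_in_time,
				     struct demo_ascb **pending)
{
	struct demo_ascb *ascb = demo_ascb_alloc();

	*pending = NULL;
	if (!ascb)
		return DEMO_FAILED;
	ascb->owned_by_sequencer = true;	/* posted to the sequencer */

	if (!completes_in_time) {
		/* Timed out: skip the free, analogous to out_not_reported. */
		*pending = ascb;
		return DEMO_FAILED;
	}

	ascb->owned_by_sequencer = false;	/* sequencer reported it done */
	demo_ascb_free(ascb);
	return DEMO_COMPLETE;
}

/* Late completion from the sequencer: only now is the free safe. */
static void demo_sequencer_done(struct demo_ascb *ascb)
{
	ascb->owned_by_sequencer = false;
	demo_ascb_free(ascb);
}
```

Calling demo_abort_task(false, &pending) leaves one ascb alive and still
marked as owned by the sequencer; freeing it at that point is the
use-after-free described in the quoted trace. In this sketch the memory is
released only once demo_sequencer_done(pending) runs.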