On Thu, 18 Jun 2009 17:36:11 -0700 Giridhar Malavali <giridhar.malavali@xxxxxxxxxx> wrote: > > On Jun 18, 2009, at 4:27 PM, FUJITA Tomonori wrote: > > > On Thu, 18 Jun 2009 12:05:24 -0700 > > Giridhar Malavali <giridhar.malavali@xxxxxxxxxx> wrote: > > > >>>> /** > >>>> * fc_bsg_job_timeout - handler for when a bsg request timesout > >>>> * @req: request that timed out > >>>> @@ -3471,19 +3479,10 @@ fc_bsg_job_timeout(struct request *req) > >>>> "abort failed with status %d\n", err); > >>>> } > >>>> > >>>> - if (!done) { > >>>> - spin_lock_irqsave(&job->job_lock, flags); > >>>> - job->ref_cnt--; > >>>> - spin_unlock_irqrestore(&job->job_lock, flags); > >>>> - fc_destroy_bsgjob(job); > >>>> - } > >>>> - > >>>> /* the blk_end_sync_io() doesn't check the error */ > >>>> return BLK_EH_HANDLED; > >>> > >>> I think that you need to return a proper value here; see my previous > >>> patch. If a job is done, we need to return BLK_EH_NOT_HANDLED here; > >>> otherwise the block layer calls __blk_complete_request for the job > >>> (and later we call blk_complete_request for the same job). > >>> > >>> The rest looks fine by me. > >> > >> static void blk_rq_timed_out(struct request *req) > >> { > >> struct request_queue *q = req->q; > >> enum blk_eh_timer_return ret; > >> > >> ret = q->rq_timed_out_fn(req); > >> switch (ret) { > >> case BLK_EH_HANDLED: > >> __blk_complete_request(req); > >> break; > >> case BLK_EH_RESET_TIMER: > >> blk_clear_rq_complete(req); > >> blk_add_timer(req); > >> break; > >> case BLK_EH_NOT_HANDLED: > >> /* > >> * LLD handles this for now but in the future > >> * we can send a request msg to abort the command > >> * and we can move more of the generic scsi eh code > >> to > >> * the blk layer. > >> */ > >> break; > >> > >> As per this code, the completion routine will be called only if we > >> return BLK_EH_HANDLED from fc transport. > > > > Correct. > > > > > >> In our case the abort of the > >> timed out request > >> has been completed and handled. That's the reason I returned > >> BLK_EH_HANDLED always. > > > > Hmm, if the 'done' is not zero in fc_bsg_job_timeout, the request > > should not be aborted or completed? > > > > I mean that if (job->state_flags & FC_RQST_STATE_DONE) is true, the > > low level driver handler (such as zfcp_fc_generic_els_handler) is > > performing the request. So fc_bsg_job_timeout() can't abort or > > complete that request. fc_bsg_job_timeout() needs to ignore the > > request. > > I agree, the code has to be changed to > if (done) > return BLK_EH_NOT_HANDLED; > else > return BLK_EH_HANDLED; Yeah, that's exactly what my patches did here. After fixing it, you can add my Acked-by if you like. I guess that there is a race in fc pass thru support if a request is timed out while a low driver is performing a request. I need to look at it later. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html