On Fri, 2010-09-24 at 11:33 -0700, Mike Anderson wrote:
> Jens Axboe <axboe@xxxxxxxxx> wrote:
> > On 2010-09-23 23:46, Nicholas A. Bellinger wrote:
> > > On Sat, 2010-09-18 at 12:58 -0700, Nicholas A. Bellinger wrote:
> > >> On Fri, 2010-09-17 at 21:45 -0500, Mike Christie wrote:

<SNIP>

> > > Greetings Mike and Co,
> > >
> > > I was doing some followup on these items for a v3 series and started
> > > with a patch following mnc's recommendations for dropping the
> > > scsi_error.c codes depending upon struct scsi_cmnd->serial_number:
> > >
> > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > > index 1de30eb..f35c127 100644
> > > --- a/drivers/scsi/scsi_error.c
> > > +++ b/drivers/scsi/scsi_error.c
> > > @@ -644,11 +644,7 @@ static int __scsi_try_to_abort_cmd(struct scsi_cmnd *scmd)
> > >   */
> > >  static int scsi_try_to_abort_cmd(struct scsi_cmnd *scmd)
> > >  {
> > > -	/*
> > > -	 * scsi_done was called just after the command timed out and before
> > > -	 * we had a chance to process it. (db)
> > > -	 */
> > > -	if (scmd->serial_number == 0)
> > > +	if (test_bit(REQ_ATOM_COMPLETE, &scmd->request->atomic_flags))
> > >  		return SUCCESS;
> > >  	return __scsi_try_to_abort_cmd(scmd);
> > >  }
> > >
> > > and while building I noticed that the simple single enum
> > > REQ_ATOM_COMPLETE=0 is currently located in block/blk.h and along
> > > with blk_mark_rq_complete() and blk_clear_rq_complete() for setting
> > > this bit within struct request->atomic_flags.
> > >
> > > jens, hch, tejun, and co, do you guys have a preference how this
> > > should be handled so that scsi_try_to_abort_cmd() can use proper
> > > atomic struct request bits here and we can get rid of this (racy..?)
> > > method of using struct scsi_cmnd->serial_number for anything wrt to
> > > per struct scsi_cmnd context timeout handling.
> >
> > Just add a
> >
> > static inline int blk_test_rq_complete(struct request *rq)
> > {
> > 	return test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
> > }
> >
> > in block/blk.h
> >
> > I need this too for some lockless completion patches I am playing with.
>

Greetings Mike,

> In reviewing the current code paths wasn't the serial_number also used to
> avoid calling __scsi_try_to_abort_cmd for START_UNIT case also.

Not sure on this item, but checking on this now..

>
> If we skip __scsi_try_to_abort_cmd when REQ_ATOM_COMPLETE is set it would
> be correct for the scsi_decide_disposition cases but it would appear this
> would stop __scsi_try_to_abort_cmd from being called in the time out
> case as REQ_ATOM_COMPLETE is set prior to calling blk_rq_timed_out.

Hmmmmmmm..

>
> 1.) Request timed out path to scsi_eh_scmd_add.
>
> blk_rq_timed_out_timer
>   ...
>   if (blk_mark_rq_complete(rq))
>           continue;
>   blk_rq_timed_out
>     q->rq_timed_out_fn "scsi_times_out"
>       scsi_times_out
>         scsi_eh_scmd_add
>
> 2.) Request completion path to scsi_eh_scmd_add based on
> scsi_decide_disposition disposition set to FAILED (actually
> scsi_eh_scmd_add called from the default case with the disposition of
> FAILED appearing to be the only disposition returned that would hit this
> case vs. the other three handled cases). This should not be a common path
> outside of handling allow_restart.
>
> blk_complete_request
>   ...
>   if (!blk_mark_rq_complete(req))
>           __blk_complete_request
>   ...
>   raise_softirq_irqoff(BLOCK_SOFTIRQ);
>   blk_done_softirq
>     rq->q->softirq_done_fn "scsi_softirq_done"
>       scsi_softirq_done
>         scsi_eh_scmd_add
>
> 3.) Call path to scsi_try_to_abort_cmd.
>
> scsi_error_handler
>   scsi_unjam_host
>     scsi_eh_abort_cmds
>       scsi_try_to_abort_cmd
>
> Thanks,
>

Ok, what I think is being said here is that using blk_test_rq_complete()
in scsi_try_to_abort_cmd() completely prevents __scsi_try_to_abort_cmd()
from being called on the struct request timeout path, right..?
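
To convince myself of the ordering problem in path 1, here is a small
userspace model of it. This is only a sketch and not kernel code: the
struct, the eh_result values and the model_* helpers are made up for
illustration, and C11 atomics stand in for the kernel bitops. The one
thing it is meant to mirror is that the timer sets REQ_ATOM_COMPLETE
before the command ever reaches the error handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; only REQ_ATOM_COMPLETE is a real name. */
enum { REQ_ATOM_COMPLETE = 0 };

enum eh_result {
	EH_NOT_QUEUED,		/* timer lost the race, command never entered EH */
	EH_SKIPPED_ABORT,	/* patched check returned early, no abort sent */
	EH_ISSUED_ABORT,	/* __scsi_try_to_abort_cmd() would have run */
};

struct model_request {
	atomic_ulong atomic_flags;
};

/* Models blk_mark_rq_complete(): set the bit, return its previous value. */
static inline bool model_mark_rq_complete(struct model_request *rq)
{
	unsigned long mask = 1UL << REQ_ATOM_COMPLETE;

	return (atomic_fetch_or(&rq->atomic_flags, mask) & mask) != 0;
}

/* Models the proposed blk_test_rq_complete(): read the bit only. */
static inline bool model_test_rq_complete(struct model_request *rq)
{
	unsigned long mask = 1UL << REQ_ATOM_COMPLETE;

	return (atomic_load(&rq->atomic_flags) & mask) != 0;
}

/* Models scsi_try_to_abort_cmd() with the patch from the diff applied. */
static inline enum eh_result model_try_to_abort(struct model_request *rq)
{
	if (model_test_rq_complete(rq))
		return EH_SKIPPED_ABORT;
	return EH_ISSUED_ABORT;
}

/* Path 1 followed by path 3: timer fires, command goes to EH, EH aborts. */
static inline enum eh_result model_timeout_path(struct model_request *rq)
{
	if (model_mark_rq_complete(rq))
		return EH_NOT_QUEUED;	/* normal completion got there first */

	/* blk_rq_timed_out -> scsi_times_out -> scsi_eh_scmd_add -> EH thread */
	return model_try_to_abort(rq);
}

/* A request that simply timed out never gets its abort in this model. */
static inline void model_check_timeout_skips_abort(void)
{
	struct model_request rq;

	atomic_init(&rq.atomic_flags, 0);
	assert(model_timeout_path(&rq) == EH_SKIPPED_ABORT);
}
```

In this model a request that does nothing but time out always comes
back as EH_SKIPPED_ABORT, because the bit tested in the abort path is
the same bit the timer had to set in order to own the request.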

Not being exquisitely familiar with the generic scsi_error.c LLD logic
and its struct request timeout code, I assume this means we need to
revisit scsi_unjam_host() once again to take into account the new
blk_test_rq_complete() calls from the normal fast path block softirq, so
that we can still handle the individual struct request timeout cases for
outstanding struct scsi_cmnd when their individual timer contexts fire.

Perhaps we should be setting another bit in struct request->atomic_flags,
something like REQ_BLKSOFTIRQ_COMPLETE, to let scsi_unjam_host() know
what is going on..?

Thanks for your comments Mike!

--nab