On Sat, Jan 06, 2007 at 09:30:45AM -0600, James Bottomley wrote:
> On Thu, 2007-01-04 at 20:46 -0700, Eric Moore wrote:
> > -	if (scsi_status == MPI_SCSI_STATUS_BUSY)
> > +	if (ioc->bus_type != SPI && scsi_status == MPI_SCSI_STATUS_BUSY)
> >  		sc->result = (DID_BUS_BUSY << 16) | scsi_status;
> >  	else
> >  		sc->result = (DID_OK << 16) | scsi_status;
>
> DID_BUS_BUSY causes an immediate retry, but it does debit the retry
> count, so it shouldn't cause "infinite retries" ... if it does, there's
> something else wrong here.

I wonder if this is the same bug I'm chasing (on ia64 machines,
reproduced with both Montecito and Madison). The symptom is a stack
overflow caused by this infinite loop:

generic_unplug_device
__generic_unplug_device
scsi_request_fn [1]
blk_requeue_request
elv_requeue_request
__elv_add_request
__generic_unplug_device
scsi_request_fn [2]
blk_requeue_request
elv_requeue_request
__elv_add_request
__generic_unplug_device
scsi_request_fn [3]
scsi_dispatch_cmd
scsi_queue_insert
blk_insert_request
scsi_request_fn [4]
blk_plug_device

(stack dump courtesy of incrementing a counter each time through
__generic_unplug_device and checking it in blk_plug_device() and
__generic_unplug_device)

I don't see how it happens; as far as I can tell, by the time we're
going to call blk_plug_device() in scsi_request_fn [4], there's no way
to unplug the queue again before it gets back to scsi_request_fn [3] ...
and from the point where we call scsi_dispatch_cmd(), we immediately
either break or test blk_queue_plugged() and exit. There should be no
way for it to call blk_requeue_request() again.
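
For anyone who wants to try the same trick, here is a rough user-space
sketch of the counter instrumentation mentioned above. It is only a
model, not the actual debugging patch: model_unplug(), model_check_depth()
and DEPTH_LIMIT are hypothetical names, the threshold is arbitrary, and
decrementing the counter on exit is an assumption about how one would
track nesting rather than something stated above.

```c
#include <stdio.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define DEPTH_LIMIT 8          /* assumed threshold, chosen for illustration */

static int unplug_depth;       /* current nesting of model_unplug() */
static int max_depth_seen;     /* deepest nesting observed so far */
static int limit_hits;         /* times the nesting reached DEPTH_LIMIT */

/* Stand-in for the check placed in blk_plug_device(). */
static void model_check_depth(const char *where)
{
    if (unplug_depth >= DEPTH_LIMIT) {
        limit_hits++;
        fprintf(stderr, "%s: unplug nesting %d, dump the stack here\n",
                where, unplug_depth);
    }
}

/*
 * Stand-in for __generic_unplug_device(): bump the counter on entry,
 * drop it on exit (the decrement is an assumption). 'requeues' models
 * how many times the request function requeues and thereby re-enters
 * the unplug path.
 */
static void model_unplug(int requeues)
{
    unplug_depth++;
    if (unplug_depth > max_depth_seen)
        max_depth_seen = unplug_depth;
    model_check_depth("model_unplug");

    if (requeues > 0)
        model_unplug(requeues - 1);   /* requeue -> nested unplug */

    unplug_depth--;
}
```

Calling model_unplug(2) from a test harness nests three levels deep and
stays below the limit; a larger argument trips the check at each level
at or beyond DEPTH_LIMIT, which is where the real instrumentation would
dump the stack.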