Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Sat, 17 Sep 2005, Jan Dittmer wrote:
>
> > > Maybe the wakeup occurred before ap->ops was set correctly, or after it
> > > was unset. Jan, at what point did the oops happen? Was it right after
> > > the device was detected, during removal, or some other time?
> > >
> > > Can you put in some debugging printk's to see what values are in ap,
> > > ap->ops, and ap->ops->eng_timeout?
> >
> > ap->ops is 0, on dereferencing I get a backtrace. ap has a valid pointer
> > (-573296044 whatever that maps to).
>
> Hmm... I imagine that when the error handler is first starting up,
> ->host_busy is equal to ->host_failed because both are 0. So that really
> is not the appropriate condition to wait for. A better approach would be
> to have an atomic_t variable recording the number of pending invocations.
>
> On the whole, I wonder if using kthread_stop here is such a good idea.
> The old mechanism for stopping worked well...
>

Since scsi_eh_wakeup can only be called on a completion or timeout of an
IO, you cannot get a comparison when both are 0 (unless we have a bug
somewhere).

If the increment of host_failed, the increment of host_busy, the decrement
of host_busy, and the comparison of host_busy to host_failed are all done
under the host_lock, why would the atomic_t be better?

-andmike
--
Michael Anderson
andmike@xxxxxxxxxx
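
The locking argument in the reply above can be sketched in userspace C. This is only a simplified model, not the kernel code: struct host_model, its helper functions, and the eh_wakeups counter are hypothetical names, a pthread mutex stands in for host_lock, and the sketch assumes (as the reply states) that the wakeup check runs only on a completion or a timeout of an IO.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host fields discussed in this thread. */
struct host_model {
    pthread_mutex_t host_lock;  /* models host_lock (a spinlock in the kernel) */
    unsigned int host_busy;     /* commands currently outstanding */
    unsigned int host_failed;   /* outstanding commands that failed or timed out */
    unsigned int eh_wakeups;    /* times the error handler would have been woken */
};

static void host_model_init(struct host_model *h)
{
    pthread_mutex_init(&h->host_lock, NULL);
    h->host_busy = 0;
    h->host_failed = 0;
    h->eh_wakeups = 0;
}

/* Caller must hold host_lock. Models the busy == failed comparison. */
static bool eh_wakeup_locked(struct host_model *h)
{
    /* A failed command is still outstanding, so failed never exceeds busy. */
    assert(h->host_failed <= h->host_busy);
    if (h->host_busy == h->host_failed) {
        h->eh_wakeups++;
        return true;
    }
    return false;
}

/* Issue a command: host_busy is incremented under the lock. */
static void issue_command(struct host_model *h)
{
    pthread_mutex_lock(&h->host_lock);
    h->host_busy++;
    pthread_mutex_unlock(&h->host_lock);
}

/* A command fails or times out: count it, then check, in one critical section. */
static bool fail_command(struct host_model *h)
{
    pthread_mutex_lock(&h->host_lock);
    h->host_failed++;
    bool woke = eh_wakeup_locked(h);
    pthread_mutex_unlock(&h->host_lock);
    return woke;
}

/*
 * A command completes normally. Assumption of this sketch: the check is
 * only made when at least one failure is pending, so 0 == 0 is never tested.
 */
static bool complete_command(struct host_model *h)
{
    bool woke = false;
    pthread_mutex_lock(&h->host_lock);
    h->host_busy--;
    if (h->host_failed > 0)
        woke = eh_wakeup_locked(h);
    pthread_mutex_unlock(&h->host_lock);
    return woke;
}
```

In this model every update and the comparison share one critical section, so the comparison always sees a consistent pair of values. Two independent atomic counters would not by themselves guarantee that the pair is read consistently, which is the point of the question above.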