On Wed, 2010-10-27 at 12:20 -0700, Mike Anderson wrote:
> Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> > On Wed, 2010-10-27 at 09:27 -0500, James Bottomley wrote:
> > > On Wed, 2010-10-27 at 09:53 +0200, Andi Kleen wrote:
> > > > > This sounds like a pretty reasonable compromise that I think is slightly
> > > > > less risky for the LLDs with the ghosts and cob-webs hanging off of
> > > > > them.
> > > >
> > > > They won't get tested either next release cycle. Essentially
> > > > near nobody uses them.
> > > >
> > > > >
> > > > > What do you think..?
> > > >
> > > > Standard linux practice is to simply push the locks down. That's a pretty
> > > > mechanical operation and shouldn't be too risky
> > > >
> > > > With some luck you could even do it with coccinelle.
> > >
> > > Precisely ... if we can do the push down now as a mechanical
> > > transformation we can put it in the current merge window as a low risk
> > > API change.
> >
> > I disagree that touching every single legacy LLD's SHT->queuecommand()
> > and failure paths in that code is a low risk change.
> >
> > > This gives us optimal exposure to the rc sequence to sort
> > > out any problems that arise (or drivers that got missed) with the lowest
> > > risk of such problems actually arising.
> >
> > Yes,
> >
> > > Given the corner cases and the
> > > late arrival of fixes, the serial number changes are just too risky for
> > > the current merge window.
> >
> > I think with andmike's testing and ACKs for the necessary scsi_error.c
> > changes this would be an acceptable risk.
> >
>
> Adding SCSI_EH_SOFTIRQ_DONE in scsi_softirq_done is not going to provide
> value in scsi_try_to_abort_cmd. scsi_softirq_done calls scsi_eh_scmd_add
> without the SCSI_EH_CANCEL_CMD flag set which will stop
> scsi_try_to_abort_cmd from being called.
>
> Removing the serial_number check in scsi_try_to_abort_cmd and not
> replacing it may be the correct action as we should be relying on the
> block complete checking.
> That said what James has indicated about
> splitting the serial number change out seems like the lower risk approach
> at this time.
>

Hmm, that is unfortunate..

So in this case it would make sense to drop the explicit LLD usage of
scsi_cmd_get_serial(), re-include it in scsi_dispatch_cmd() for all
LLDs, and then deal with a per scsi_host atomic_t serial_number counter.

Anyways, I will go ahead and respin another series to follow this logic
shortly.

The other question that was mentioned in my email yesterday would be if
the clearing of a non atomic_t cmd->serial_number from
scsi_softirq_done() -> scsi_try_to_abort_cmd() is safe to begin with..?
Does this need to be converted to an atomic_t as well to prevent a subtle
race outside of any of the host_lock-less series of patches..?

--nab
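
For illustration only, the per scsi_host atomic counter idea mentioned
above could look roughly like the userspace sketch below. It uses C11
atomics as a stand-in for the kernel's atomic_t. The names host_sketch,
cmd_sketch, host_sketch_init() and cmd_get_serial_sketch() are
hypothetical and are not taken from any posted patch, and treating 0 as
"no serial number assigned" is an assumption of this sketch.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the per-host state (not struct Scsi_Host). */
struct host_sketch {
	atomic_ulong cmd_serial_number;
};

/* Hypothetical stand-in for the per-command state (not struct scsi_cmnd). */
struct cmd_sketch {
	unsigned long serial_number;
};

static void host_sketch_init(struct host_sketch *host)
{
	atomic_init(&host->cmd_serial_number, 0);
}

/*
 * Assign the next serial number without holding any lock.
 * atomic_fetch_add() returns the previous value, so adding 1 yields the
 * value this caller reserved. A result of 0 (counter wrap) is skipped,
 * on the assumption that 0 means "no serial number assigned".
 */
static void cmd_get_serial_sketch(struct host_sketch *host,
				  struct cmd_sketch *cmd)
{
	unsigned long serial;

	do {
		serial = atomic_fetch_add(&host->cmd_serial_number, 1) + 1;
	} while (serial == 0);

	cmd->serial_number = serial;
}
```

Calling this once from a common dispatch path, rather than from each
LLD, keeps the assignment in one place. It does not by itself answer
whether a later non-atomic clear of the per-command field is safe.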