These patches (against a 2.6.9 kernel, Redhat AS 4) for st add a reference count, and fix a few bugs I hit during testing (administrative removes and cable pulls with IO in progress). The primary change is adding a kref to the Scsi_Tape structure, to avoid an oops when the tape drive is removed while open. This includes scsi_tape_get/put routines and scsi_tape_release, and changes to st_open and st_release. This is all based on the SCSI disk & CD reference counting code. I hit some bugs while I was testing this code. Fixes for these are also included in the patch: - When an IO gets a "dead device" error, we need to check the rq_status to determine if the request actually completed (scsi_wait_req does this)... else there's no notification to the caller that the IO failed. There are two changes for this, one in st_do_scsi and one in write_behind_check. - When an async IO gets a "dead device" error, we're notified of the completion by the block layer (end_that_request_last), so last_SRpnt doesn't get set (as it would if the complete were done by st_sleep_done). So I changed how last_SRpnt is used to get around this... it's now set only for async IOs, just before the IO is issued (and nulled out when the IO comes back). I also hit the following problem when I briefly tested on a SUSE kernel (it wasn't in the Redhat kernel I did most of my testing with, but it is in 2.6.12): - In these kernels, we no longer get a "complete" from end_that_request_last when a device goes away and an IO is rejected. I therefore added to st_do_scsi (but it's commented in this patch) setting SRpnt->sr_request->end_io to blk_end_sync_rq, which will do the complete. Since blk_end_sync_rq nulls out rq->waiting, I changed st_do_scsi to wait on a local variable, and added a check for null in st_sleep_done (just in case). And just one more thing: - I noticed that in 2.6.12, there is what appears to be a bogus use of last_SRpnt in st_chk_result (this doesn't exist in the kernels I tested with). This would cause problems if these patches are ported to 2.6.12, because of how I changed the use of last_SRpnt. Either my code has to change, or st_chk_result shouldn't use last_SRpnt. Anyway... as I said, these are against a 2.6.9 kernel (Redhat AS 4). If you're interested, I could port these changes to 2.6.12... can't promise I'll be able to test with that kernel, though. Nate Dailey Stratus Technologies Signed-off-by: Nate Dailey <nate.dailey@xxxxxxxxxxx>
Attachment:
st.c.patch
Description: st.c.patch
Attachment:
st.h.patch
Description: st.h.patch