Kai Makisara wrote:
On Sat, 12 Nov 2005, Mike Christie wrote:
I noticed that these patches still have the same bug that the 2.4 kernel st driver has: the driver holds on to st's SCSI request struct until write_behind_check is called. This behavior is responsible for at least two bugs with tape systems under 2.4 that we've fixed.

The first bug: if you perform a write to a tape device that involves an async write-behind request, then attempt to access the device via the sg mechanism without performing any intervening read or ioctl commands on the st device, the sg access will hang. This only happens on SCSI controllers that set cmd_per_lun to 1 (e.g. mptscsih). To replicate the problem you need one application that writes to the tape device and then pauses. While the writer is paused, something as simple as an INQUIRY to the tape causes the hang. This happens at least with NetBackup, possibly with others as well.

The second bug is related to multiple tape usage on the same system. It only happens on x86_64, not i686, but with multiple tapes in use the system eventually attempts to dma map a null pointer, resulting in a BUG(). I didn't root cause the dma mapping issue, but I did verify that once the first bug was fixed, the dma mapping bug went away as well (either because whatever race window existed became so small that we no longer hit it, or because the problem was in fact fixed).

The patch we used to solve the problem is attached.

As a side note, holding on to a command without any upper bound on when it will be released is simply a *bad* idea. Get the information you need from the command and free it.
-- Doug Ledford <dledford@xxxxxxxxxx> http://people.redhat.com/dledford
--- drivers/scsi-orig/st.c	2005-09-14 17:44:16.000000000 -0400
+++ drivers/scsi/st.c	2005-09-15 17:36:52.000000000 -0400
@@ -353,7 +353,14 @@
 	(STp->buffer)->last_SRpnt = SCpnt->sc_request;
 	DEB( STp->write_pending = 0; )
-	complete(SCpnt->request.waiting);
+	if ((STp->buffer)->writing) {
+		/* This is a write-behind request, we need to release the
+		 * scsi request struct */
+		(STp->buffer)->syscall_result = st_chk_result(STp, SCpnt->sc_request);
+		SCpnt->sc_request->sr_request.waiting = NULL;
+		scsi_release_request(SCpnt->sc_request);
+	}
+	complete(&(STp->wait));
 }
@@ -423,10 +430,6 @@
 	) /* end DEB */
 	wait_for_completion(&(STp->wait));
-	(STp->buffer)->last_SRpnt->sr_request.waiting = NULL;
-
-	(STp->buffer)->syscall_result = st_chk_result(STp, (STp->buffer)->last_SRpnt);
-	scsi_release_request((STp->buffer)->last_SRpnt);
 	STbuffer->buffer_bytes -= STbuffer->writing;
 	STps = &(STp->ps[STp->partition]);
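
For readers who want to see the difference between the two release points without a tape drive at hand, here is a small user-space toy model. It is not kernel code, and every name in it (toy_lun, toy_write_done, and so on) is made up for illustration. It also rests on an assumption the message does not spell out: that the sg access hangs because the request st is still holding ties up the only command slot a controller with cmd_per_lun == 1 allows.

```c
#include <stdbool.h>

/* Toy model only. None of these names exist in the st driver. */

enum release_policy {
    RELEASE_AT_CHECK,       /* old behaviour: hold until the next st call */
    RELEASE_AT_COMPLETION   /* patched behaviour: free when the write completes */
};

struct toy_lun {
    int  free_slots;      /* stands in for cmd_per_lun (assumed mechanism) */
    bool st_holds_slot;   /* st still owns a request */
    int  pending_result;  /* status carried by the held request */
    int  saved_result;    /* status copied out for the next st call */
};

void toy_lun_init(struct toy_lun *lun, int cmd_per_lun)
{
    lun->free_slots = cmd_per_lun;
    lun->st_holds_slot = false;
    lun->pending_result = 0;
    lun->saved_result = 0;
}

/* Take a command slot; returns false where a real caller would block. */
bool toy_acquire(struct toy_lun *lun)
{
    if (lun->free_slots == 0)
        return false;
    lun->free_slots--;
    return true;
}

/* st starts an asynchronous write-behind request. */
bool toy_start_write_behind(struct toy_lun *lun)
{
    if (!toy_acquire(lun))
        return false;
    lun->st_holds_slot = true;
    return true;
}

/* Copy the status out of the held request, then give the slot back. */
void toy_save_and_release(struct toy_lun *lun)
{
    lun->saved_result = lun->pending_result;
    lun->st_holds_slot = false;
    lun->free_slots++;
}

/* The write finishes. Only the patched policy releases the slot here. */
void toy_write_done(struct toy_lun *lun, enum release_policy policy, int result)
{
    lun->pending_result = result;
    if (policy == RELEASE_AT_COMPLETION && lun->st_holds_slot)
        toy_save_and_release(lun);
}

/* The next st read/write/ioctl: release now if still held, report status. */
int toy_write_behind_check(struct toy_lun *lun)
{
    if (lun->st_holds_slot)
        toy_save_and_release(lun);
    return lun->saved_result;
}

/* Writer has paused after its write completed; can sg get a slot? */
bool toy_sg_can_issue_while_paused(enum release_policy policy, int cmd_per_lun)
{
    struct toy_lun lun;

    toy_lun_init(&lun, cmd_per_lun);
    toy_start_write_behind(&lun);
    toy_write_done(&lun, policy, 0);
    return toy_acquire(&lun);   /* the INQUIRY attempt via sg */
}
```

In this model, toy_sg_can_issue_while_paused(RELEASE_AT_CHECK, 1) fails because the slot stays taken until the writer makes another st call, while RELEASE_AT_COMPLETION succeeds. With two slots both policies succeed, which matches the observation that only controllers with cmd_per_lun == 1 are affected. The write status still reaches the next st call because it is copied out before the slot is released.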