Re: [PATCH 5/10] convert st to use scsi_execute_async

Kai Makisara wrote:
On Sun, 13 Nov 2005, Doug Ledford wrote:


Kai Makisara wrote:

On Sat, 12 Nov 2005, Mike Christie wrote:

I noticed that these patches still have the same bug that the 2.4 kernel st
driver has, namely the holding of the st's SCSI request struct until
write_behind_check is called.  This behavior is responsible for at least two
bugs with tape systems under 2.4 that we've fixed.  The first bug is that if
you perform a write to a tape device that involves an async write behind
request, then attempt to access the device via the sg mechanism without
performing any intervening read or ioctl commands on the st device, the sg
access will hang.  This only happens on SCSI controllers that set the
cmd_per_lun value == 1 (e.g. mptscsih).  To replicate this problem you
need one application that writes to the tape device and then pauses; something
as simple as attempting an INQUIRY to the tape while the writer is
paused then causes the hang.  This happens at least with NetBackup, possibly with
others as well.  The second bug is related to multiple tape usage on the same
system.  It only happens on x86_64, not i686, but with multiple tapes in use
the system eventually attempts to dma map a null pointer resulting in a BUG().
I didn't root cause the dma mapping issue, but I did verify that once the
initial bug was fixed, the dma mapping bug went away as well (either because
whatever race window existed became so small that we no longer hit it
or the problem was in fact fixed).  The patch we used to solve the problem is
attached.  As a side note, holding on to a command without any upper bound on
when it will be released is simply a *bad* idea.  Get the information you need
from the command and free it.
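
The first hang can be pictured with a toy model. This is only a sketch, not code from st or from the attached patch: struct toy_lun, toy_get_slot() and the other toy_* names are made up for illustration, and the single counter is an assumed stand-in for whatever accounting the SCSI layer really does when cmd_per_lun == 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: a LUN whose host adapter grants a fixed
 * number of outstanding commands (cmd_per_lun). */
struct toy_lun {
    int cmd_per_lun;   /* command slots available for this LUN */
    int in_use;        /* slots currently claimed */
};

/* Result copied out of a finished command so the slot can be freed. */
struct toy_result {
    int status;
    int residual;
};

/* Claim a slot; returning false models "the caller would block". */
static bool toy_get_slot(struct toy_lun *lun)
{
    if (lun->in_use >= lun->cmd_per_lun)
        return false;
    lun->in_use++;
    return true;
}

/* Give a slot back. */
static void toy_put_slot(struct toy_lun *lun)
{
    assert(lun->in_use > 0);
    lun->in_use--;
}

/* Async write that keeps its slot until some later check releases it. */
static bool toy_write_hold(struct toy_lun *lun)
{
    return toy_get_slot(lun);   /* slot stays claimed after "completion" */
}

/* Async write that saves what it needs at completion and frees the slot. */
static bool toy_write_release_early(struct toy_lun *lun,
                                    struct toy_result *out)
{
    if (!toy_get_slot(lun))
        return false;
    out->status = 0;            /* pretend the write completed cleanly */
    out->residual = 0;
    toy_put_slot(lun);
    return true;
}

/* An sg-style INQUIRY needs a slot of its own for a moment. */
static bool toy_sg_inquiry(struct toy_lun *lun)
{
    if (!toy_get_slot(lun))
        return false;
    toy_put_slot(lun);
    return true;
}
```

With cmd_per_lun set to 1, toy_sg_inquiry() fails after toy_write_hold() and succeeds after toy_write_release_early(), which is the difference the fix is after.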



You are complaining about one feature and reporting a possible bug without much information.

I was going to put in a simple reproducer, but the reproducer was based on code from someone else and I don't know the redistribution status of it since it wasn't my bug report.

It seems that you (RedHat) have been sitting on this report for a long time and have shipped the fix for your own clients only. Not very nice!

Not true at all. The fix hasn't even shipped yet except in a HOTFIX kernel; it will be released in our next update.

Originally there was a reason why the SCSI request struct was held until write_behind_check. The reason was to execute a minimum amount of code in interrupt context. For a very long time, scsi_done has been called outside interrupt context and this reason is not valid any more.

For 2.4 that's not entirely true. The old error handler drivers in 2.4 still do their work in interrupt context, and the driver I referenced, mptscsih, is an old error handler driver.

The reason why this has not changed is that nobody has asked for it.

I don't see any reason why the change you suggest should not be made. Does anyone else? If nobody complains, I will make the change for 2.6.16.

The dma bug you are talking about is interesting but I don't have any idea why it is happening. Releasing the SCSI request earlier should not have anything to do with that.

I never isolated it down to a root cause. I suspect the more NUMA-like nature of x86_64 compared to i686 SMP has something to do with it.

Mixing sg access with ULD operation is almost always a bad idea.

Maybe. If you are talking about doing an sg write operation while also doing st writes, then yeah, that's bad. But there's no inherent reason that something like INQUIRY or some status command can't be intermixed into the overall command stream via sg, and I would consider it a bug in the core scsi layer if it didn't handle that properly.
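
For concreteness, the kind of harmless sg access meant here could look roughly like the sketch below, which uses the sg v3 SG_IO ioctl from <scsi/sg.h>. The helper names build_inquiry() and send_inquiry(), the buffer sizes and the 5 second timeout are assumptions chosen for illustration, not anything taken from the bug report.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define INQUIRY_OPCODE  0x12
#define INQUIRY_CDB_LEN 6

/* Fill in a 6-byte standard INQUIRY CDB and an sg v3 header pointing at it.
 * data receives the INQUIRY response, sense receives any sense data. */
static void build_inquiry(sg_io_hdr_t *hdr,
                          unsigned char cdb[INQUIRY_CDB_LEN],
                          unsigned char *data, unsigned char data_len,
                          unsigned char *sense, unsigned char sense_len)
{
    assert(hdr && cdb && data && sense);

    memset(cdb, 0, INQUIRY_CDB_LEN);
    cdb[0] = INQUIRY_OPCODE;
    cdb[4] = data_len;                      /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';                /* required for sg v3 */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len = INQUIRY_CDB_LEN;
    hdr->cmdp = cdb;
    hdr->dxfer_len = data_len;
    hdr->dxferp = data;
    hdr->mx_sb_len = sense_len;
    hdr->sbp = sense;
    hdr->timeout = 5000;                    /* milliseconds, arbitrary */
}

/* Issue the prepared command on an already opened sg file descriptor.
 * Returns the ioctl() result: 0 on success, -1 with errno set on failure. */
static int send_inquiry(int sg_fd, sg_io_hdr_t *hdr)
{
    return ioctl(sg_fd, SG_IO, hdr);
}
```

A caller would open the sg node for the tape, call build_inquiry() with a 36-byte data buffer, and then call send_inquiry(). In the scenario described earlier in the thread, that call is the one that never returns while the st driver still holds its request.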

Thanks for the report and fix.

No problem.


--
Doug Ledford <dledford@xxxxxxxxxx>
http://people.redhat.com/dledford
