Re: [PATCH v2 0/4] scsi: st: scsi_error: More reset patches

> On 13. Dec 2024, at 19.32, John Meneghini <jmeneghi@xxxxxxxxxx> wrote:
> 
> On 12/13/24 08:09, "Kai Mäkisara (Kolumbus)" wrote:
>>> On 12. Dec 2024, at 20.27, Kai Mäkisara (Kolumbus) <kai.makisara@xxxxxxxxxxx> wrote:
>>> 
>>> While doing some detective work, I found a serious problem. So, please hold these patches again.
>>> More about the reason below.
>> ...
>>> The problem is that no driver options for the device can be set before something has
>>> been done to clear the blocking. For instance, the stinit tool is a recommended method
>>> to set the options based on a configuration file, but it fails.
> 
> And "the blocking" is because pos_unknown is set?
> 
>>> Note that this problem has existed since commit 9604eea5bd3ae1fa3c098294f4fc29ad687141ea
>>> (for version 6.6) that added recognition of POR UA as an additional method to detect
>>> resets. Nobody seems to have noticed this problem in the "real world". (Using
>>> was_reset was not problematic because it caught only resets initiated by the midlevel.)
> 
> Just to be clear. People in the real world did notice this problem. We have multiple customers who have reported "regressions" in the st driver, all of whom started using a version of our distribution which had commit 9604eea5bd3a. The changes for
> 9604eea5bd3a (scsi: st: Add third party poweron reset handling) were necessary to fix a real customer reported problem, but there were a number of regressions introduced by this change and it looks like we haven't gotten to the bottom of these regressions. Basically, we had so many customer complaints about this that we reverted commit 9604eea5bd3a in rhel-8.

This sounds puzzling. The patch 9604eea5bd3a has been signed off by you. Now you say that 
there were a number of regressions, so that you have reverted the commit in rhel-8. Yet, there
have been no reports of regressions in linux-scsi. Or have I missed something?


I have run some experiments with st.c from v6.4 (before the commit) and v6.7 (after the
commit). My (slightly tuned) scsi_debug was started with option 'scsi_level=6'. The
test used the stinit tool, which can be used to set st parameters after a drive has been
detected (using, e.g., udev). (And I think that any decently configured Linux system
with tape drives should set the proper parameters for the drives.)

The test uses modprobe to load scsi_debug (which also loads st). After that,
stinit and dd were tried:

st.c from v6.4:
- stinit succeeds
- 'dd if=/dev/nst0 of=/dev/null bs=10240 count=10' succeeds

st.c from v6.7:
- stinit fails
- dd fails
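
For anyone who wants to repeat the test sequence above, here is a minimal sketch as a
shell script. The helper names (run_cmd, repro) and the RUN variable are made up for
illustration and are not from any existing tool. The sketch assumes that stinit is run
without arguments and finds its default configuration file, and that the tape device
shows up as /dev/nst0. An unmodified scsi_debug may behave differently from the tuned
one used here. By default the script only prints the commands; set RUN=1 and run it as
root to execute them.

```shell
#!/bin/sh
# Sketch of the reproduction sequence: load scsi_debug, then try stinit and dd.
# Commands are only printed unless RUN=1 is set in the environment.

# run_cmd: print the command line, and execute it only when RUN=1.
run_cmd() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# repro: run the three steps and report which of the tools failed.
repro() {
    status=0
    # Loading scsi_debug also loads st when a tape device is found.
    run_cmd modprobe scsi_debug scsi_level=6 || return 1
    # stinit sets the st driver options from its configuration file.
    run_cmd stinit || { echo "stinit failed"; status=1; }
    # Read from the non-rewinding tape device.
    run_cmd dd if=/dev/nst0 of=/dev/null bs=10240 count=10 || { echo "dd failed"; status=1; }
    return "$status"
}

repro
```

The exit status is nonzero if modprobe, stinit or dd fails, so the same script can be
run unchanged against kernels with different versions of st.c.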


So, there are clear regressions caused by commit 9604eea5bd3ae1fa3c098294f4fc29ad687141ea
and this must be fixed. One method is, of course, to revert the commit. Another alternative is to do
something to solve the problems created by the commit.

Modifying st to accept what stinit uses even if pos_unknown is true fixes the stinit problem,
but dd still fails. Not setting pos_unknown after the initial POR UA fixes both problems.





