Re: [PATCH 0/5]scsi:scsi_debug: Add error injection for single device

"haowenchao (C)" <haowenchao2@xxxxxxxxxx> · Sat, 25 Mar 2023 11:23:46 +0800

On 2023/3/25 1:31, Douglas Gilbert wrote:
On 2023-03-23 23:42, haowenchao (C) wrote:
On 2023/3/24 1:24, Douglas Gilbert wrote:
On 2023-03-23 12:25, John Garry wrote:
On 23/03/2023 13:13, haowenchao (C) wrote:
On 2023/3/23 20:40, John Garry wrote:
On 23/03/2023 11:55, Wenchao Hao wrote:
The original error injection mechanism was based on scsi_host which
could not inject fault for a single SCSI device.

This patchset provides the ability to inject errors for a single
SCSI device. Now we supports inject timeout errors, queuecommand
errors, and hostbyte, driverbyte, statusbyte, and sense data for
specific SCSI Command.

There is already a basic mechanism to generate errors - like timeouts - on "nth" command. Can you say why you want this new interface? What special scenarios are you trying to test/validate (which could not be achieved based on the current mechanism)?

I am testing a new error handle policy which is based on single scsi_device
without set host to RECOVERY. So I need a method to generate errors for
single SCSI devices.

While we can not generate errors for single device with current mechanism
because it is designed for host-wide error generation.

With this series we would have 2x methods to inject errors, which is less than ideal, and they seem to possibly conflict as well, e.g. I set timeout for nth command via current interface and then use the new interface to set timeout for some other cadence. What behavior to expect ...?

I did not take this issue in consideration. I now assume the users would
not use these 2 methods at same time.

What's more, I don not know where to write the usage of this newly added
interface, maybe we can explain these in doc?

sysfs entries are described in Documentation/ABI, but please don't add elaborate programming interfaces in sysfs files (like in these patches) - a sysfs file should be just for reading or writing a single value

Hi,
Maybe this link might help for scsi_debug documentation:
     https://doug-gilbert.github.io/scsi_debug.html

And rather than sysfs for complicated, per (pseudo_ device
settings, perhaps we could think about a SCSI mechanism like
the "Unit Attention" mode page [0x0] which is vendor specific
and used by Seagate and WDC for this sort of thing.
A framework is already in the scsi_debug driver to change
some mode page settings:

# sdparm /dev/sg0
     /dev/sg0: Linux     scsi_debug        0191
Read write error recovery mode page:
   AWRE          1  [cha: n, def:  1]
   ARRE          1  [cha: n, def:  1]
   PER           0  [cha: n, def:  0]
Caching (SBC) mode page:
   WCE           1  [cha: y, def:  1]
   RCD           0  [cha: n, def:  0]
Control mode page:
   SWP           0  [cha: n, def:  0]
Informational exceptions control mode page:
   EWASC         0  [cha: n, def:  0]
   DEXCPT        1  [cha: n, def:  1]
   MRIE          0  [cha: y, def:  0]

As can be seen WCE and MRIE are changeable, so

# sdparm --clear=WCE /dev/sg0
# sdparm --get=WCE /dev/sg0
     /dev/sg0: Linux     scsi_debug        0191
WCE           0  [cha: y, def:  1]

Doug Gilbert

Do you mean define scsi_debug's own format of mode page0(Vendor specific)
which contains these error injection info, and set/get these parameters
via sdparm?
If so, do we need to modify the sdparm code for these changes?

Not a problem.

I want to add more injections in scsi_debug to test the SCSI middle layer,
for example, control return SUCCESS in scsi_debug_abort() or
scsi_debug_device_reset().

These injections are more oriented to developers to trigger and observe
the error handler of SCSI middle layer.

We can extend other error injections conveniently via my interface,
for example, add a new error code to add a new injection to control the
return value of scsi_debug_abort().

If it's not recommended to add this interface in sysfs, what about proc? Like
/proc/scsi/scsi, we can write "scsi remove-single-device h:c:t:l" to manage
device.

In the past, procfs has been used for this sort of thing
but the powers that be want to phase that usage out.

debugfs, even though it is usually mounted under sysfs at
/sys/kernel/debug , does not seem to have the "one value
per variable" restriction. So debugfs or configfs
( /sys/kernel/config ) might be better candidates.
See 'df -ahT' .

Define scsi_debug's own format of mode page0 seems too complex and we have to
change the sdparm code. In order to make a better scalability, I prefer to add
these interface via debugfs. Actually, I design the format of the "error interface"
described in patch1 by referring to interfaces of ftrace.

The new interfaces based on debugfs would look like following:
/sys/kernel/debug/scsi_debug
	|
	+-- 0:0:0:1
	|	|
	|	\-- error
	|
	\-- 0:0:0:2
		|
		\-- error

I'm not saying that I am a huge fan of the current inject mechanism, but at the very least you need to provide more justification for this series.

The first patch add an sysfs interface to add and inquiry single
device's error injection info; the second patch defined how to remove
an injection which has been added. The following 3 patches use the
injection info and generate the related error type.

Wenchao Hao (5):
   scsi:scsi_debug: Add sysfs interface to manage scsi devices' error
     injection
   scsi:scsi_debug: Define grammar to remove added error injection
   scsi:scsi_debug: timeout command if the error is injected
   scsi:scsi_debug: Return failed value if the error is injected
   scsi:scsi_debug: set command's result and sense data if the error is
     injected

  drivers/scsi/scsi_debug.c | 296 ++++++++++++++++++++++++++++++++++++++
  1 file changed, 296 insertions(+)