Hi Daniel.
[resent in plain text format, sorry...]
On 4/09/24 01:58, Daniel Palmer wrote:
I have no idea if this fix is appropriate, the code in this driver
makes my brain hurt just looking at it, but sometimes when getting
scsi_pointer from cmd cmd itself is actually null and we get a weird
garbage pointer for scsi_pointer.
I am not sure I read that code right (you are correct in saying all that
state machine code inside the interrupt handler makes for headaches, but
that's a different matter).
To me, this looks like the driver takes an interrupt without a connected
command (that is why cmd ends up NULL). The interrupt turns out to be a
selection interrupt, and the attempt to read the current bus phase from
the command private data fails.
The driver does use scsi_pointer elsewhere without ever checking it's
valid, so it would seem this case is not meant to happen. On the other
hand, we do not expect to have a connected command while selecting, so
this case is pretty much _guaranteed_ to happen!
It would appear that when Bart worked on this driver to move
scsi_pointer to command private data (commit
dbb2da557a6a87c88bbb4b1fef037091b57f701b in my tree), he overlooked the
fact that 'cmd' is reloaded inside the interrupt handler from host
queues, in response to interrupt conditions or phases. The copy of
scsi_pointer initially obtained (from the connected command, without
checking that there is in fact a command connected) is never reloaded
though. Even if there had been a connected command, scsi_pointer would
be stale at the point where the phase information is needed to construct
the IDENTIFY message.
The old code used phase information from the just reloaded selecting
command, so reloading scsi_pointer at this time (as you do) is the
correct behaviour.
I am not certain that the test for NULL cmd at the start of the
interrupt handler is actually required. Any part of the driver that
changes cmd and makes use of scsi_pointer must reload scsi_pointer
afterwards. The selection interrupt part seems the only part using
scsi_pointer, so your fix is sufficient.
Please respin and set the appropriate Fixes: tag.
Cheers,
Michael
When this is accessed later on bad things happen. For my machine
this happens when the SCSI bus is initially scanned.
With this "fix" SCSI on my MVME147 is happy again.
[ 84.330000] wd33c93-0: chip=WD33c93B/13 no_sync=0xff no_dma=0
[ 84.330000] debug_flags=0x00
[ 84.350000] setup_args=
<snip>
[ 84.490000]
[ 84.510000] Version 1.26++ - 10/Feb/2007
[ 84.520000] scsi host0: MVME147 built-in SCSI
[ 85.480000] sending SDTR 0103015e00
[ 85.480000] 01
[ 85.490000] 03
[ 85.500000] 01
[ 85.510000] 00
[ 85.520000] 00
[ 85.520000] sync_xfer=30
[ 85.530000] scsi 0:0:5:0: Direct-Access BlueSCSI HARDDRIVE 2.0 PQ: 0 ANSI: 2
[ 85.820000] st: Version 20160209, fixed bufsize 32768, s/g segs 256
[ 85.900000] sd 0:0:5:0: Attached scsi generic sg0 type 0
Signed-off-by: Daniel Palmer<daniel@xxxxxxxx>
---
drivers/scsi/wd33c93.c | 10 +++++++---
drivers/scsi/wd33c93.h | 2 ++
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/wd33c93.c b/drivers/scsi/wd33c93.c
index a44b60c9004a..9789d852d541 100644
--- a/drivers/scsi/wd33c93.c
+++ b/drivers/scsi/wd33c93.c
@@ -733,7 +733,7 @@ transfer_bytes(const wd33c93_regs regs, struct scsi_cmnd *cmd,
void
wd33c93_intr(struct Scsi_Host *instance)
{
- struct scsi_pointer *scsi_pointer;
+ struct scsi_pointer *scsi_pointer = NULL;
struct WD33C93_hostdata *hostdata =
(struct WD33C93_hostdata *) instance->hostdata;
const wd33c93_regs regs = hostdata->regs;
@@ -752,7 +752,9 @@ wd33c93_intr(struct Scsi_Host *instance)
#endif
cmd = (struct scsi_cmnd *) hostdata->connected; /* assume we're connected */
- scsi_pointer = WD33C93_scsi_pointer(cmd);
+ /* cmd could be null */
+ if (cmd)
+ scsi_pointer = WD33C93_scsi_pointer(cmd);
sr = read_wd33c93(regs, WD_SCSI_STATUS); /* clear the interrupt */
phs = read_wd33c93(regs, WD_COMMAND_PHASE);
@@ -828,8 +830,10 @@ wd33c93_intr(struct Scsi_Host *instance)
(struct scsi_cmnd *) hostdata->selecting;
hostdata->selecting = NULL;
- /* construct an IDENTIFY message with correct disconnect bit */
+ /* cmd should now be valid and we can get scsi_pointer */
+ scsi_pointer = WD33C93_scsi_pointer(cmd);
+ /* construct an IDENTIFY message with correct disconnect bit */
hostdata->outgoing_msg[0] = IDENTIFY(0, cmd->device->lun);
if (scsi_pointer->phase)
hostdata->outgoing_msg[0] |= 0x40;
diff --git a/drivers/scsi/wd33c93.h b/drivers/scsi/wd33c93.h
index e5e4254b1477..898c1c7d024d 100644
--- a/drivers/scsi/wd33c93.h
+++ b/drivers/scsi/wd33c93.h
@@ -259,6 +259,8 @@ struct WD33C93_hostdata {
static inline struct scsi_pointer *WD33C93_scsi_pointer(struct scsi_cmnd *cmd)
{
+ WARN_ON_ONCE(!cmd);
+
return scsi_cmd_priv(cmd);
}