Re: Investigating potential flaw in scsi error handling

On Sat, 2008-02-09 at 22:59 +0100, Elias Oltmanns wrote:
> Hi there,
> 
> I'm experiencing system lockups with 2.6.24 which I believe to be
> related to scsi error handling. Actually, I have patched the mainline
> kernel with a disk shock protection patch [1] and in my case it is indeed
> the shock protection mechanism that triggers the lockups. However, some
> rather lengthy investigations have led me to the conclusion that this
> additional patch is just the means to reproduce the error condition
> fairly reliably rather than the origin of the problem.
> 
> The problem has only become apparent since Tejun's commit
> 31cc23b34913bc173680bdc87af79e551bf8cc0d. More precisely, libata now
> sets max_host_blocked and max_device_blocked to 1 for all ATA devices.
> Various tests I've conducted so far have led me to the conclusion that
> a non zero return code from scsi_dispatch_command is sufficient to
> trigger the problem I'm seeing provided that max_host_blocked and
> max_device_blocked are set to 1.

There's nothing inherently incorrect with setting max_device_blocked to
1, but it is suboptimal: for a single-queue device, it means that
returning a wait causes an immediate reissue.
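
To make the countdown concrete, here is a rough userspace model of the
behaviour described above. It is only a sketch: the struct and the
model_* helpers are invented for illustration and are not the midlayer
code; only the field names are borrowed from struct scsi_device.

```c
/* Hypothetical, simplified stand-in for the per-device state. */
struct model_dev {
    unsigned int device_busy;        /* commands currently outstanding */
    unsigned int device_blocked;     /* countdown armed on a busy return */
    unsigned int max_device_blocked; /* value the countdown is armed with */
};

/* The driver asked for a wait: arm the countdown. */
static void model_report_busy(struct model_dev *d)
{
    d->device_blocked = d->max_device_blocked;
}

/* One queue run: returns 1 if a command may be issued, 0 if still blocked. */
static int model_queue_ready(struct model_dev *d)
{
    if (d->device_busy == 0 && d->device_blocked) {
        /* An idle device counts down by one on each queue run. */
        if (--d->device_blocked != 0)
            return 0;
    }
    return d->device_blocked == 0;
}

/* Count the queue runs needed before a command is reissued after a wait. */
static int model_runs_until_reissue(unsigned int max_blocked)
{
    struct model_dev d = { 0, 0, max_blocked };
    int runs = 0;

    model_report_busy(&d);
    do {
        runs++;
    } while (!model_queue_ready(&d));
    return runs;
}
```

In this model, model_runs_until_reissue(1) needs a single queue run,
i.e. the command goes straight back to the device, whereas a larger
value gives the device that many queue runs of breathing room.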

> Unfortunately, I'm a bit at a loss as to how I should proceed to find
> the culprit. I can reliably reproduce the problem using the disk shock
> protection patch in order to cause non zero return values from
> scsi_dispatch_command. How can I find out where in the error handling of
> this condition things might go wrong?
> 
> Most likely you will need further information to help me solve this
> issue but perhaps you can already come up with some suggestions and tell
> me what else you'd like to know.

Well, in the first place, I'm not sure why you refer to a non-zero
return from scsi_dispatch_command(), since that's an internal API; the
non-zero return should come from ->queuecommand().

However, if you've patched scsi_dispatch_command() I'd guess that would
be the problem.
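
As a rough illustration of where the non-zero return is expected to
originate, the sketch below models a driver callback reporting busy and
the dispatch path reacting to it. Everything here is hypothetical
userspace code: the model_* names are made up, the numeric values are
meant to mirror the SCSI_MLQUEUE_* constants as I recall them, and the
real midlayer does considerably more than this.

```c
/* Illustrative return codes (assumed to mirror SCSI_MLQUEUE_*). */
enum {
    MODEL_OK          = 0,
    MODEL_HOST_BUSY   = 0x1055,
    MODEL_DEVICE_BUSY = 0x1056
};

struct model_cmd {
    int id;
};

/* Hypothetical host state with a driver-supplied callback. */
struct model_host {
    int (*queuecommand)(struct model_cmd *cmd);
    unsigned int host_blocked, max_host_blocked;
    unsigned int device_blocked, max_device_blocked;
    int requeued; /* how many commands were put back on the queue */
};

/* Example driver callback that always asks the caller to wait. */
static int model_busy_queuecommand(struct model_cmd *cmd)
{
    (void)cmd;
    return MODEL_DEVICE_BUSY;
}

/* Example driver callback that accepts every command. */
static int model_accept_queuecommand(struct model_cmd *cmd)
{
    (void)cmd;
    return MODEL_OK;
}

/*
 * Dispatch one command. The non-zero status comes from the driver's
 * callback; the dispatcher only reacts to it by arming the matching
 * blocked counter and requeueing the command.
 */
static int model_dispatch(struct model_host *h, struct model_cmd *cmd)
{
    int rtn = h->queuecommand(cmd);

    if (rtn == MODEL_OK)
        return 0;

    if (rtn == MODEL_DEVICE_BUSY)
        h->device_blocked = h->max_device_blocked;
    else
        h->host_blocked = h->max_host_blocked;

    h->requeued++;
    return rtn;
}
```

The point of the model is only that the busy status is produced inside
the callback; a dispatcher patched to return non-zero on its own would
bypass that contract.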

James

