Re: [Bug 12120] [Block layer or SCSI] requests aborted too early during check_partition()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29 Nov, James Bottomley wrote:
> OK, so there are a few problems.  First, by responding OK to the test
> unit ready (which is illegal under spec) it avoids the spin up the sd
> driver normally does, so we're relying on the eh allow_restart flag to
> start the unit on the first failing command.  Then, in the failure case:
> 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00
>> 08 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
>> initializing command required
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
>> Nov 29 19:57:06 stein Waking error handler thread
>> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
>> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
>> cancel: 0
>> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
>> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
>> 00
>> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
>> 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
>> information
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1
> 
> The second start unit is a failure ... I suspect because of our change
> to no sense return handling.  What the drive is probably trying to say
> is (I'm spinning up) but this gets interpreted as an error because the
> sense data for this isn't present (because we didn't ask for it).
> 
> Can you try this patch?  It should take the success return of the first
> spin up and act on it instead of blindly sending another.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..635d8b4 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>  	if (scmd->device->allow_restart) {
>  		int i, rtn = NEEDS_RETRY;
>  
> -		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> +		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
>  			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
>  						scmd->device->timeout, 0);
>  
> -		if (rtn == SUCCESS)
> -			return 0;
> +			if (rtn == SUCCESS)
> +				return 0;
> +			/* if failure, wait before retrying */
> +			ssleep(3);
> +		}
>  	}
>  
>  	return 1;
> 
> 

Still no luck.  Here is a log from a simpler test:
# sg_start --stop /dev/sdb
# echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
# hdparm -tT /dev/sdb
This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
fails under 2.6.28-rc with disk put offline.

The firmware apparently doesn't like the command abortion (SBP-2 fetch
agent reset) while it is spinning up.


Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:12:53 mini buffer = 0xffff88007b60c5f0, bufflen = 262144, queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini leaving scsi_dispatch_cmnd()
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 FAILED
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready, initializing command required
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:12:53 mini Waking error handler thread
Nov 29 22:12:53 mini Error handler scsi_eh_2 waking up
Nov 29 22:12:53 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 22:12:53 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff8800712ef000
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini Sense Key : No Sense [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0
Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000
Nov 29 22:12:55 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:12:55 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Nov 29 22:12:56 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini scsi_eh_2: Sending BDR sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini ieee1394: sbp2: reset requested
Nov 29 22:12:56 mini ieee1394: sbp2: generating sbp2 fetch agent reset
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:12:56 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
Nov 29 22:13:02 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:13:02 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Nov 29 22:13:02 mini scsi_eh_done scmd: ffff88007b76af08 result: 20000
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 4840
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2003
Nov 29 22:13:02 mini scsi_eh_tur: scmd ffff88007b76af08 rtn 2003
Nov 29 22:13:02 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:13:02 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:13:02 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:13:02 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending HRST
Nov 29 22:13:02 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:13:02 mini scsi_eh_2: HRST failed
Nov 29 22:13:02 mini sd 2:0:0:0: Device offlined - not ready after error recovery
Nov 29 22:13:02 mini scsi_eh_2: flush finish cmd: ffff88007b76af08
Nov 29 22:13:02 mini sd 2:0:0:0: Notifying upper driver of completion (result 2)
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini 512 sectors total, 0 bytes done.
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:13:02 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini scsi_restart_operations: waking up host to restart
Nov 29 22:13:02 mini Error handler scsi_eh_2 sleeping

-- 
Stefan Richter
-=====-==--- =-== ===-=
http://arcgraph.de/sr/

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux