Re: Transmitting payload and ATA commands simultaneously messes up connection with USB SATA bridge

On Sun, Dec 19, 2010 at 07:00:54PM +0100, Richard Schütz wrote:
> On 12/19/10 18:04, Sergey Vlasov wrote:
> > Could you also get a usbmon trace when you first issue "smartctl -d
> > sat" on an otherwise idle device, and then do some writes to it?
> > Maybe the device returns a wrong residue even in this case, but then
> > subsequent commands issued by smartctl reset the bad state.
> 
> I did "smartctl -d sat -a" on the unmounted drive, then mounted it and 
> wrote some data on it: http://richard.qasl.de/3.mon.out

Looking at this file:

> ffff8800aea3f240 718957672 S Bo:2:006:2 -115 31 = 55534243 ba000000 00020000 80001085 080e0000 00010000 00000000 00ec00
> ffff8800aea3f240 718957810 C Bo:2:006:2 0 31 >
> ffff8800bfb2ba80 718957835 S Bi:2:006:1 -115 512 <
> ffff8800bfb2ba80 718975331 C Bi:2:006:1 0 512 = 7a42ff3f 37c81000 00000000 3f000000 00000000 20202020 57202d44 4d575a41
> ffff8800aea3f240 718975365 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 718975435 C Bi:2:006:1 0 13 = 55534253 ba000000 00020000 00

ATA_16/IDENTIFY (a PIO Data-in command): all 512 requested bytes are
transferred, yet the device reports a 512 byte residue, as in the
previous log.
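
For reference, the 13-byte blocks in these traces are Bulk-Only
Transport Command Status Wrappers: the signature "USBS", then the tag,
the data residue and a status byte, all little-endian.  A minimal
Python sketch for decoding one from the usbmon hex words (parse_csw is
a made-up helper name, not part of any existing tool):

```python
import struct

def parse_csw(hex_words: str) -> dict:
    """Decode a 13-byte Command Status Wrapper given as usbmon hex words."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 13:
        raise ValueError(f"CSW must be 13 bytes, got {len(raw)}")
    # Layout: 4-byte signature, 32-bit tag, 32-bit residue, 8-bit status.
    signature, tag, residue, status = struct.unpack("<4sIIB", raw)
    if signature != b"USBS":
        raise ValueError("bad CSW signature")
    return {"tag": tag, "residue": residue, "status": status}

# CSW of the IDENTIFY command quoted above.
csw = parse_csw("55534253 ba000000 00020000 00")
print(f"tag={csw['tag']:#x} residue={csw['residue']} status={csw['status']}")
# prints: tag=0xba residue=512 status=0
```

A residue of 512 after a complete 512 byte transfer claims that no
data was delivered at all, which contradicts the data stage.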

> ffff8800aea3f240 718998178 S Bo:2:006:2 -115 31 = 55534243 bb000000 00020000 80001085 080e00d0 00010000 004f00c2 00b000
> ffff8800aea3f240 718998306 C Bo:2:006:2 0 31 >
> ffff8800bfb2b0c0 718998314 S Bi:2:006:1 -115 512 <
> ffff8800bfb2b0c0 719017820 C Bi:2:006:1 0 512 = 1000012f 00c8c800 00000000 00000327 00bfa74a 15000000 00000432 00646434
> ffff8800aea3f240 719017828 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719017945 C Bi:2:006:1 0 13 = 55534253 bb000000 00020000 00

ATA_16/SMART/READ_VALUES (again a PIO Data-in command), the same 512
byte residue.

> ffff8800aea3f240 719018015 S Bo:2:006:2 -115 31 = 55534243 bc000000 00020000 80001085 080e00d1 00010001 004f00c2 00b000
> ffff8800aea3f240 719018069 C Bo:2:006:2 0 31 >
> ffff8800bfb2b0c0 719018076 S Bi:2:006:1 -115 512 <
> ffff8800bfb2b0c0 719038208 C Bi:2:006:1 0 512 = 10000133 c8c8c8c8 c8000000 00000315 00000000 00000000 00000400 00000000
> ffff8800aea3f240 719038245 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719038310 C Bi:2:006:1 0 13 = 55534253 bc000000 00020000 00

ATA_16/SMART/READ_THRESHOLDS, 512 byte residue...

> ffff8800aea3f240 719038455 S Bo:2:006:2 -115 31 = 55534243 bd000000 00000000 00001085 062c00da 00000000 004f00c2 00b000
> ffff8800aea3f240 719038578 C Bo:2:006:2 0 31 >
> ffff8800aea3f240 719038595 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719155963 C Bi:2:006:1 0 13 = 55534253 bd000000 00000000 00

ATA_16/SMART/STATUS, this command does not transfer any data (SAT
protocol code = 3, Non-data); now residue is 0 (and any non-zero
residue with zero data size would be an obvious bug).
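
The protocol code can be read straight from the 31-byte command block:
the CDB starts at offset 15, and for ATA PASS-THROUGH (16), opcode
0x85, the protocol sits in bits 4:1 of CDB byte 1.  A rough Python
sketch (parse_cbw and the small SAT_PROTOCOLS table are illustrative
names, and the table lists only a few of the codes defined by SAT):

```python
import struct

# Partial table of SAT protocol codes; only the ones relevant here.
SAT_PROTOCOLS = {3: "Non-data", 4: "PIO Data-In", 5: "PIO Data-Out", 6: "DMA"}

def parse_cbw(hex_words: str) -> dict:
    """Decode a 31-byte Command Block Wrapper given as usbmon hex words."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 31:
        raise ValueError(f"CBW must be 31 bytes, got {len(raw)}")
    signature, tag, data_len, flags, lun, cdb_len = struct.unpack_from("<4sIIBBB", raw)
    if signature != b"USBC":
        raise ValueError("bad CBW signature")
    cdb = raw[15:15 + cdb_len]
    info = {"tag": tag, "data_len": data_len, "data_in": bool(flags & 0x80),
            "opcode": cdb[0]}
    if cdb[0] == 0x85 and len(cdb) == 16:       # ATA PASS-THROUGH (16)
        info["protocol"] = (cdb[1] >> 1) & 0x0F  # bits 4:1 of byte 1
        info["ata_feature"] = cdb[4]             # features (7:0)
        info["ata_command"] = cdb[14]            # ATA command register
    return info

status = parse_cbw("55534243 bd000000 00000000 00001085 062c00da "
                   "00000000 004f00c2 00b000")
print(SAT_PROTOCOLS[status["protocol"]], status["data_len"])
# prints: Non-data 0
```

Run on the IDENTIFY command block, the same function gives protocol 4
(PIO Data-In) with a 512 byte transfer length.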

> ffff8800aea3f240 719157001 S Bo:2:006:2 -115 31 = 55534243 be000000 00020000 80001085 080e00d5 00010001 004f00c2 00b000
> ffff8800aea3f240 719157079 C Bo:2:006:2 0 31 >
> ffff88010dc89600 719157101 S Bi:2:006:1 -115 512 <
> ffff88010dc89600 719172556 C Bi:2:006:1 0 512 = 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 719172603 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719172694 C Bi:2:006:1 0 13 = 55534253 be000000 00020000 00

ATA_16/SMART/READ_LOG_SECTOR, 512 byte residue is back.

> ffff8800aea3f240 719172762 S Bo:2:006:2 -115 31 = 55534243 bf000000 00020000 80001085 080e00d5 00010006 004f00c2 00b000
> ffff8800aea3f240 719172805 C Bo:2:006:2 0 31 >
> ffff8800bfb2b0c0 719172822 S Bi:2:006:1 -115 512 <
> ffff8800bfb2b0c0 719189307 C Bi:2:006:1 0 512 = 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 719189327 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719189444 C Bi:2:006:1 0 13 = 55534253 bf000000 00020000 00

ATA_16/SMART/READ_LOG_SECTOR (with different parameters), 512 byte
residue...

> ffff8800aea3f240 719189527 S Bo:2:006:2 -115 31 = 55534243 c0000000 00020000 80001085 080e00d5 00010009 004f00c2 00b000
> ffff8800aea3f240 719189555 C Bo:2:006:2 0 31 >
> ffff8801375ed480 719189572 S Bi:2:006:1 -115 512 <
> ffff8801375ed480 719205574 C Bi:2:006:1 0 512 = 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 719205584 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719205681 C Bi:2:006:1 0 13 = 55534253 c0000000 00020000 00

More ATA_16/SMART/READ_LOG_SECTOR, more 512 byte residue...

> ffff8800aea3f240 733148055 S Bo:2:006:2 -115 31 = 55534243 c1000000 00040000 80000a28 00000008 02000002 00000000 000000
> ffff8800aea3f240 733148141 C Bo:2:006:2 0 31 >
> ffff88009ff3dc00 733148160 S Bi:2:006:1 -115 1024 <
> ffff88009ff3dc00 733148261 C Bi:2:006:1 0 1024 = 00607505 000ed515 00000000 d5f1d410 74387505 00000000 02000000 02000000
> ffff8800aea3f240 733148281 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 733148386 C Bi:2:006:1 0 13 = 55534253 c1000000 00000000 00

READ_10 - and now the residue is 0, as it should be.

> ffff8800aea3f240 733148959 S Bo:2:006:2 -115 31 = 55534243 c2000000 00100000 80000a28 00000008 00000008 00000000 000000
> ffff8800aea3f240 733149012 C Bo:2:006:2 0 31 >
> ffff88011f578540 733149021 S Bi:2:006:1 -115 4096 <
> ffff88011f578540 733149264 C Bi:2:006:1 0 4096 = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 733149270 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 733149386 C Bi:2:006:1 0 13 = 55534253 c2000000 00000000 00
> ffff8800aea3f240 733149424 S Bo:2:006:2 -115 31 = 55534243 c3000000 00100000 80000a28 00000008 08000008 00000000 000000
> ffff8800aea3f240 733149513 C Bo:2:006:2 0 31 >
> ffff88011f578e40 733149519 S Bi:2:006:1 -115 4096 <
> ffff88011f578e40 733149636 C Bi:2:006:1 0 4096 = 01040000 11040000 21040000 b803f51f 02000400 00000000 00000000 f51fc9b2
> ffff8800aea3f240 733149644 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 733149763 C Bi:2:006:1 0 13 = 55534253 c3000000 00000000 00

More READ_10 commands, all of which work fine.

Note that the first command after the SMART commands in this case is a
READ, not a WRITE.  Maybe only write commands issued in the broken
state are mishandled - but even if this is the case, it would be hard
to work around in the kernel (injecting dummy commands does not look
good).  And the residue values on all ATA_16 commands which read data
from the drive are obviously bad - but smartctl ignores that field.
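
The check done by hand above can be sketched as a small Python script
that walks a usbmon text trace and flags every status block whose
residue disagrees with "requested length minus bytes actually moved".
This is only a sketch: find_bad_residues is a made-up name, it assumes
the usbmon text format shown in this mail, and it does not try to
handle failed or stalled URBs.

```python
import struct

def find_bad_residues(trace_lines):
    """Yield (tag, requested, transferred, residue) for inconsistent CSWs."""
    requested = transferred = 0
    cbw_in_flight = False
    for line in trace_lines:
        fields = line.lstrip("> ").split()   # drop mail quoting, if any
        if len(fields) < 7 or fields[2] not in ("S", "C") or not fields[5].isdigit():
            continue
        event, address, length, marker = fields[2], fields[3], int(fields[5]), fields[6]
        data = bytes.fromhex("".join(fields[7:])) if marker == "=" else b""
        if event == "S":
            # A submitted 31-byte "USBC" block starts a new command.
            if address.startswith("Bo") and length == 31 and data[:4] == b"USBC":
                requested = struct.unpack_from("<I", data, 8)[0]
                transferred = 0
                cbw_in_flight = True
            continue
        if cbw_in_flight and address.startswith("Bo"):
            cbw_in_flight = False            # completion of the CBW itself
        elif length == 13 and data[:4] == b"USBS":
            tag, residue = struct.unpack_from("<II", data, 4)
            if residue != requested - transferred:
                yield tag, requested, transferred, residue
        else:
            transferred += length            # data stage completion

IDENTIFY = [
    "ffff8800aea3f240 718957672 S Bo:2:006:2 -115 31 = 55534243 ba000000 00020000 80001085 080e0000 00010000 00000000 00ec00",
    "ffff8800aea3f240 718957810 C Bo:2:006:2 0 31 >",
    "ffff8800bfb2ba80 718957835 S Bi:2:006:1 -115 512 <",
    "ffff8800bfb2ba80 718975331 C Bi:2:006:1 0 512 = 7a42ff3f 37c81000 00000000 3f000000 00000000 20202020 57202d44 4d575a41",
    "ffff8800aea3f240 718975365 S Bi:2:006:1 -115 13 <",
    "ffff8800aea3f240 718975435 C Bi:2:006:1 0 13 = 55534253 ba000000 00020000 00",
]

for tag, req, moved, residue in find_bad_residues(IDENTIFY):
    print(f"tag {tag:#x}: requested {req}, transferred {moved}, residue {residue}")
# prints: tag 0xba: requested 512, transferred 512, residue 512
```

Fed the READ_10 exchanges from the same trace, the generator yields
nothing, since there the residue of 0 matches the data stage.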

> > One possible option is to ignore the bad residue by using the
> > US_FL_IGNORE_RESIDUE flag for the device - the 0x152d, 0x2329 JMicron
> > bridge is already in drivers/usb/storage/unusual_devs.h with quirk
> > flags US_FL_IGNORE_RESIDUE | US_FL_SANE_SENSE, maybe this device needs
> > such workarounds too.  This can be tested without rebuilding the
> > kernel by specifying an additional parameter for the usb-storage
> > module:
> >
> >    quirks=1e68:001b:ar
> >
> > ("echo 1e68:001b:ar>  /sys/module/usb_storage/parameters/quirks"
> > should work even without reloading the module).
> 
> This seems to work. I can write data on the drive and query SMART 
> simultaneously without tons of errors and interruption.
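
For anyone reading along, the quirks string is a comma-separated list
of VID:PID:flags entries with hexadecimal IDs and one letter per flag;
"a" stands for US_FL_SANE_SENSE and "r" for US_FL_IGNORE_RESIDUE.  A
small Python sketch that splits such a string (parse_quirks is a
made-up helper, and the table deliberately lists only the two letters
used here, not the kernel's full set):

```python
# Only the two flag letters used in this thread; the kernel knows more.
QUIRK_FLAGS = {"a": "US_FL_SANE_SENSE", "r": "US_FL_IGNORE_RESIDUE"}

def parse_quirks(param: str):
    """Split a usb-storage quirks string into (vid, pid, [flag names])."""
    entries = []
    for item in param.split(","):
        vid, pid, letters = item.strip().split(":")
        flags = [QUIRK_FLAGS.get(c, f"unknown flag {c!r}") for c in letters]
        entries.append((int(vid, 16), int(pid, 16), flags))
    return entries

for vid, pid, flags in parse_quirks("1e68:001b:ar"):
    print(f"{vid:04x}:{pid:04x} -> {' | '.join(flags)}")
# prints: 1e68:001b -> US_FL_SANE_SENSE | US_FL_IGNORE_RESIDUE
```

So the tested setting applies the same two flags that the existing
JMicron entry carries.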

Looks like another patch for more broken hardware is in order...
Vendor/product IDs that are specific to the brand on the outside label
sometimes do more harm than good: instead of a single entry for the
particular broken USB-ATA chip, a separate entry is needed for every
manufacturer which used its own vendor ID.

Note that entries in drivers/usb/storage/unusual_devs.h should be
sorted by vendor and product ID.
