On Sun, Dec 19, 2010 at 07:00:54PM +0100, Richard Schütz wrote:
> On 12/19/10 18:04, Sergey Vlasov wrote:
> > Could you also get an usbmon trace when you first issue "smartctl -d
> > sat" on an otherwise idle device, and then do some writes to it?
> > Maybe the device returns a wrong residue even in this case, but then
> > subsequent commands issued by smartctl reset the bad state.
>
> I did "smartctl -d sat -a" on the unmounted drive, then mounted it and
> wrote some data on it: http://richard.qasl.de/3.mon.out

Looking at this file:

> ffff8800aea3f240 718957672 S Bo:2:006:2 -115 31 = 55534243 ba000000 00020000 80001085 080e0000 00010000 00000000 00ec00
> ffff8800aea3f240 718957810 C Bo:2:006:2 0 31 >
> ffff8800bfb2ba80 718957835 S Bi:2:006:1 -115 512 <
> ffff8800bfb2ba80 718975331 C Bi:2:006:1 0 512 = 7a42ff3f 37c81000 00000000 3f000000 00000000 20202020 57202d44 4d575a41
> ffff8800aea3f240 718975365 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 718975435 C Bi:2:006:1 0 13 = 55534253 ba000000 00020000 00

ATA_16/IDENTIFY, and the device reports a 512 byte residue, as in the
previous log.

> ffff8800aea3f240 718998178 S Bo:2:006:2 -115 31 = 55534243 bb000000 00020000 80001085 080e00d0 00010000 004f00c2 00b000
> ffff8800aea3f240 718998306 C Bo:2:006:2 0 31 >
> ffff8800bfb2b0c0 718998314 S Bi:2:006:1 -115 512 <
> ffff8800bfb2b0c0 719017820 C Bi:2:006:1 0 512 = 1000012f 00c8c800 00000000 00000327 00bfa74a 15000000 00000432 00646434
> ffff8800aea3f240 719017828 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719017945 C Bi:2:006:1 0 13 = 55534253 bb000000 00020000 00

ATA_16/SMART/READ_VALUES (again a PIO Data-in command), the same 512
byte residue.
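
For anyone who wants to check the residue values without reading the hex by eye, the 13-byte status blocks (CSW) in the trace can be decoded mechanically. This is only a minimal Python sketch, assuming the data words are pasted verbatim from the usbmon output; the names parse_csw and residue_is_consistent are made up for illustration and do not come from any existing tool.

```python
import struct

CSW_SIGNATURE = b"USBS"  # bytes 55 53 42 53 at the start of every CSW


def parse_csw(hex_words):
    """Decode a 13-byte Bulk-Only Command Status Wrapper from usbmon hex."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 13:
        raise ValueError("a CSW is exactly 13 bytes, got %d" % len(raw))
    # Little-endian fields: signature, tag, data residue, status byte.
    signature, tag, residue, status = struct.unpack("<4sIIB", raw)
    if signature != CSW_SIGNATURE:
        raise ValueError("bad CSW signature")
    return {"tag": tag, "residue": residue, "status": status}


def residue_is_consistent(requested, received, residue):
    """The residue should equal the bytes requested but not transferred."""
    return residue == requested - received


# CSW of the ATA_16/IDENTIFY exchange (tag 0xba) from the trace.
identify = parse_csw("55534253 ba000000 00020000 00")
print(identify)
print(residue_is_consistent(512, 512, identify["residue"]))
```

For the IDENTIFY exchange this yields a residue of 512 although all 512 requested bytes arrived, so the consistency check fails, which is the behaviour described here.
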

> ffff8800aea3f240 719018015 S Bo:2:006:2 -115 31 = 55534243 bc000000 00020000 80001085 080e00d1 00010001 004f00c2 00b000
> ffff8800aea3f240 719018069 C Bo:2:006:2 0 31 >
> ffff8800bfb2b0c0 719018076 S Bi:2:006:1 -115 512 <
> ffff8800bfb2b0c0 719038208 C Bi:2:006:1 0 512 = 10000133 c8c8c8c8 c8000000 00000315 00000000 00000000 00000400 00000000
> ffff8800aea3f240 719038245 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719038310 C Bi:2:006:1 0 13 = 55534253 bc000000 00020000 00

ATA_16/SMART/READ_THRESHOLDS, 512 byte residue...

> ffff8800aea3f240 719038455 S Bo:2:006:2 -115 31 = 55534243 bd000000 00000000 00001085 062c00da 00000000 004f00c2 00b000
> ffff8800aea3f240 719038578 C Bo:2:006:2 0 31 >
> ffff8800aea3f240 719038595 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719155963 C Bi:2:006:1 0 13 = 55534253 bd000000 00000000 00

ATA_16/SMART/STATUS, this command does not transfer any data (SAT
protocol code = 3, Non-data); now residue is 0 (and any non-zero
residue with zero data size would be an obvious bug).

> ffff8800aea3f240 719157001 S Bo:2:006:2 -115 31 = 55534243 be000000 00020000 80001085 080e00d5 00010001 004f00c2 00b000
> ffff8800aea3f240 719157079 C Bo:2:006:2 0 31 >
> ffff88010dc89600 719157101 S Bi:2:006:1 -115 512 <
> ffff88010dc89600 719172556 C Bi:2:006:1 0 512 = 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 719172603 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719172694 C Bi:2:006:1 0 13 = 55534253 be000000 00020000 00

ATA_16/SMART/READ_LOG_SECTOR, 512 byte residue is back.
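
The difference between the Non-data and PIO Data-in commands can be read straight from the 31-byte command blocks (CBW). A rough Python sketch follows, assuming the standard Bulk-Only CBW layout and the ATA PASS-THROUGH (16) CDB layout from SAT; parse_ata16_cbw and the SAT_PROTOCOLS table are illustrative names, and the table lists only a few protocol codes.

```python
import struct

# A few SAT protocol codes (bits 4:1 of CDB byte 1); not an exhaustive list.
SAT_PROTOCOLS = {3: "Non-data", 4: "PIO Data-in", 5: "PIO Data-out", 6: "DMA"}


def parse_ata16_cbw(hex_words):
    """Decode a Bulk-Only CBW that carries an ATA PASS-THROUGH (16) CDB."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 31:
        raise ValueError("a CBW is exactly 31 bytes, got %d" % len(raw))
    # Little-endian: signature, tag, transfer length, flags, LUN,
    # CDB length, then the 16-byte command block.
    sig, tag, xfer_len, flags, lun, cb_len, cdb = struct.unpack(
        "<4sIIBBB16s", raw)
    if sig != b"USBC":
        raise ValueError("bad CBW signature")
    if cdb[0] != 0x85:
        raise ValueError("not an ATA PASS-THROUGH (16) command")
    protocol = (cdb[1] >> 1) & 0x0F
    return {
        "tag": tag,
        "transfer_length": xfer_len,
        "data_in": bool(flags & 0x80),   # direction bit of bmCBWFlags
        "protocol": protocol,
        "protocol_name": SAT_PROTOCOLS.get(protocol, "other"),
        "features": cdb[4],              # SMART subcommand, e.g. 0xda
        "command": cdb[14],              # ATA command, 0xb0 = SMART
    }


# SMART/STATUS (tag 0xbd) and SMART/READ_LOG_SECTOR (tag 0xbe) from the trace.
status = parse_ata16_cbw("55534243 bd000000 00000000 00001085 062c00da "
                         "00000000 004f00c2 00b000")
read_log = parse_ata16_cbw("55534243 be000000 00020000 80001085 080e00d5 "
                           "00010001 004f00c2 00b000")
print(status["protocol_name"], status["transfer_length"])
print(read_log["protocol_name"], read_log["transfer_length"])
```

The STATUS command decodes as protocol 3 with a zero transfer length, while READ_LOG_SECTOR decodes as protocol 4 with a 512-byte transfer, matching the comments on the trace.
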

> ffff8800aea3f240 719172762 S Bo:2:006:2 -115 31 = 55534243 bf000000 00020000 80001085 080e00d5 00010006 004f00c2 00b000
> ffff8800aea3f240 719172805 C Bo:2:006:2 0 31 >
> ffff8800bfb2b0c0 719172822 S Bi:2:006:1 -115 512 <
> ffff8800bfb2b0c0 719189307 C Bi:2:006:1 0 512 = 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 719189327 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719189444 C Bi:2:006:1 0 13 = 55534253 bf000000 00020000 00

ATA_16/SMART/READ_LOG_SECTOR (with different parameters), 512 byte
residue...

> ffff8800aea3f240 719189527 S Bo:2:006:2 -115 31 = 55534243 c0000000 00020000 80001085 080e00d5 00010009 004f00c2 00b000
> ffff8800aea3f240 719189555 C Bo:2:006:2 0 31 >
> ffff8801375ed480 719189572 S Bi:2:006:1 -115 512 <
> ffff8801375ed480 719205574 C Bi:2:006:1 0 512 = 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 719205584 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 719205681 C Bi:2:006:1 0 13 = 55534253 c0000000 00020000 00

More ATA_16/SMART/READ_LOG_SECTOR, more 512 byte residue...

> ffff8800aea3f240 733148055 S Bo:2:006:2 -115 31 = 55534243 c1000000 00040000 80000a28 00000008 02000002 00000000 000000
> ffff8800aea3f240 733148141 C Bo:2:006:2 0 31 >
> ffff88009ff3dc00 733148160 S Bi:2:006:1 -115 1024 <
> ffff88009ff3dc00 733148261 C Bi:2:006:1 0 1024 = 00607505 000ed515 00000000 d5f1d410 74387505 00000000 02000000 02000000
> ffff8800aea3f240 733148281 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 733148386 C Bi:2:006:1 0 13 = 55534253 c1000000 00000000 00

READ_10 - and now the residue is 0, as it should be.

> ffff8800aea3f240 733148959 S Bo:2:006:2 -115 31 = 55534243 c2000000 00100000 80000a28 00000008 00000008 00000000 000000
> ffff8800aea3f240 733149012 C Bo:2:006:2 0 31 >
> ffff88011f578540 733149021 S Bi:2:006:1 -115 4096 <
> ffff88011f578540 733149264 C Bi:2:006:1 0 4096 = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffff8800aea3f240 733149270 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 733149386 C Bi:2:006:1 0 13 = 55534253 c2000000 00000000 00
> ffff8800aea3f240 733149424 S Bo:2:006:2 -115 31 = 55534243 c3000000 00100000 80000a28 00000008 08000008 00000000 000000
> ffff8800aea3f240 733149513 C Bo:2:006:2 0 31 >
> ffff88011f578e40 733149519 S Bi:2:006:1 -115 4096 <
> ffff88011f578e40 733149636 C Bi:2:006:1 0 4096 = 01040000 11040000 21040000 b803f51f 02000400 00000000 00000000 f51fc9b2
> ffff8800aea3f240 733149644 S Bi:2:006:1 -115 13 <
> ffff8800aea3f240 733149763 C Bi:2:006:1 0 13 = 55534253 c3000000 00000000 00

More READ_10 commands, and they work fine.

Note that the first command after the SMART commands in this case is a
READ, not a WRITE. Maybe only write commands are mishandled in the
broken state - but even if this is the case, it would be hard to work
around in the kernel (injecting dummy commands does not look good).
And the residue values on all ATA_16 commands which read data from the
drive are obviously bad - but smartctl ignores that field.

> > One possible option is to ignore the bad residue by using the
> > US_FL_IGNORE_RESIDUE flag for the device - the 0x152d, 0x2329 JMicron
> > bridge is already in drivers/usb/storage/unusual_devs.h with quirk
> > flags US_FL_IGNORE_RESIDUE | US_FL_SANE_SENSE, maybe this device needs
> > such workarounds too. This can be tested without rebuilding the
> > kernel by specifying an additional parameter for the usb-storage
> > module:
> >
> >     quirks=1e68:001b:ar
> >
> > ("echo 1e68:001b:ar> /sys/module/usb_storage/parameters/quirks"
> > should work even without reloading the module).
>
> This seems to work.
> I can write data on the drive and query SMART
> simultaneously without tons of errors and interruption.

Looks like another patch for more broken hardware is in order...

Sometimes vendor/product IDs that are specific to the brand on the
outer label do more harm than good: instead of a single entry for the
particular broken USB-ATA chip, a separate entry is needed for every
manufacturer which used its own vendor ID.

Note that entries in drivers/usb/storage/unusual_devs.h should be
sorted by vendor and product ID.
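
To make the quirks string and the ordering rule concrete, here is a small Python sketch. It assumes only what is mentioned in this thread: the "vendor:product:flags" form of the quirks parameter, with the letters "a" and "r" standing for US_FL_SANE_SENSE and US_FL_IGNORE_RESIDUE. The helper names parse_quirk and is_sorted_by_id are hypothetical, and the flag table deliberately covers just those two letters.

```python
# Only the two flag letters discussed in this thread; the kernel
# accepts more letters than are listed here.
KNOWN_FLAGS = {
    "a": "US_FL_SANE_SENSE",
    "r": "US_FL_IGNORE_RESIDUE",
}


def parse_quirk(entry):
    """Split a 'VID:PID:flags' quirks entry into numeric IDs and flag names."""
    vid, pid, letters = entry.split(":")
    flags = [KNOWN_FLAGS.get(ch, "unknown:" + ch) for ch in letters]
    return int(vid, 16), int(pid, 16), flags


def is_sorted_by_id(ids):
    """True if (vendor, product) pairs are in ascending order."""
    return all(a <= b for a, b in zip(ids, ids[1:]))


vid, pid, flags = parse_quirk("1e68:001b:ar")
print(hex(vid), hex(pid), flags)

# The existing JMicron entry (0x152d, 0x2329) sorts before the new device,
# so a new entry for 0x1e68, 0x001b would have to come after it.
print(is_sorted_by_id([(0x152D, 0x2329), (vid, pid)]))
```

Tuple comparison orders by vendor ID first and product ID second, which is the ordering expected in unusual_devs.h.
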