Re: [Bug 200917] 4.18 regression: I/O error on external icybox disk enclosures

On Fri, 24 Aug 2018, Klaus Kusche wrote:

> 
> On 24/08/2018 17:39, Alan Stern wrote:
> > On Fri, 24 Aug 2018, Klaus Kusche wrote:
> >> On 24/08/2018 16:15, Alan Stern wrote:
> >>> On Fri, 24 Aug 2018, Klaus Kusche wrote:
> >>>> I entered the following USB bug into kernel bugzilla yesterday:
> >>>>
> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=200917
> >>>>
> >>>> "Since 4.18, all my external USB3-to-SATA Icybox disk enclosures with usb Id
> >>>> 357d:7788 (seems to be a very common controller chip: Sharkoon QuickPort XT)
> >>>> fail with the following error when mounting an ext4 fs:
> > 
> > If all else fails, you can try using git-bisect to find the commit
> > which causes the errors.
> 
> Uhh, I'm not sure I have the time to do that...
> (that's something like 15 kernel builds?)

Something like that.  The later ones tend to go relatively fast, because not 
much code changes between iterations.
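
As a rough sanity check on that estimate: bisection halves the range of
candidate commits on every round, so the number of builds grows with the
base-2 logarithm of the commit count. The sketch below only illustrates
that arithmetic. The figure of 13000 commits is a placeholder assumption,
not the actual number of commits between 4.17 and 4.18, and
`bisect_steps` is a hypothetical helper name.

```python
import math

def bisect_steps(num_commits: int) -> int:
    """Approximate worst-case number of build-and-test rounds needed
    to isolate one bad commit among num_commits candidates."""
    if num_commits < 1:
        raise ValueError("need at least one candidate commit")
    # Each round discards about half of the remaining candidates.
    return math.ceil(math.log2(num_commits))

# Placeholder commit count (an assumption, not taken from the kernel tree).
print(bisect_steps(13000))
```

With a guess of 15 builds, the range could hold up to 2**15 = 32768
commits, so the estimate is in the right region for one release cycle.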

> >>>> print_req_error: critical target error, dev sdd, sector 2048
> >>>> Buffer I/O error on dev sdd1, logical block 0, lost sync page write
> >>>> EXT4-fs (sdd1): I/O error while writing superblock
> >>>> EXT4-fs (sdd1): mount failed
> >>>>
> >>>> - They worked before 4.18.
> >>>> - Reading is definitely ok, async writing seems to work, too.
> >>>> - The problem occurs with several different disks (I only tested HGST drives).
> >>>> - The same disks work in enclosures with other controllers."
> > 
> > It does sound like a bug in the enclosure.
> 
> It is that specific controller chip.
> I tried 3 enclosures with that usb id (all fail since 4.18),
> and 5 enclosures with different usb id's (all still work).
> 
> >>> Please provide the output from "dmesg".
> > 
> >> [ 3692.559336] sd 7:0:0:0: [sdd] 976773168 512-byte logical blocks: (500 GB/466 GiB)
> >> [ 3692.559595] sd 7:0:0:0: [sdd] Write Protect is off
> >> [ 3692.559598] sd 7:0:0:0: [sdd] Mode Sense: 47 00 10 08
> >> [ 3692.559881] sd 7:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> >> [ 3692.575820]  sdd: sdd1
> >> [ 3692.576941] sd 7:0:0:0: [sdd] Attached SCSI disk
> >> [ 3725.164065] sd 7:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> >> [ 3725.164071] sd 7:0:0:0: [sdd] tag#0 Sense Key : Illegal Request [current]
> >> [ 3725.164075] sd 7:0:0:0: [sdd] tag#0 Add. Sense: Invalid field in cdb
> >> [ 3725.164080] sd 7:0:0:0: [sdd] tag#0 CDB: Write(10) 2a 08 00 00 08 00 00 00 08 00
> > 
> > This indicates the error occurred shortly after the drive was plugged
> > in.  A usbmon trace might be helpful.  Can you collect and send a trace
> > for bus 4, starting shortly before you plug in the drive and ending
> > after the error occurs?
> I'll try tomorrow (I don't have usbmon installed or configured).

Distribution kernels generally configure usbmon by default.  You don't
need to install anything special to use it, provided it is configured
in the kernel.  In your case, you would just have to do:

	modprobe usbmon
	mount -t debugfs none /sys/kernel/debug
	cat /sys/kernel/debug/usb/usbmon/4u >usb4.out

(substitute whatever name you like for the output file).
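
Once a trace has been collected, it can help to narrow it down to one
device before reading it. The following is a minimal sketch, assuming
the text ("u") format in which each line starts with a URB tag, a
timestamp in microseconds, an event type (S, C or E) and an address word
of the form type:bus:device:endpoint. The names `UsbmonEvent` and
`parse_usbmon_line` are hypothetical, and the sample line is made up for
illustration rather than taken from a real trace.

```python
from typing import NamedTuple, Optional

class UsbmonEvent(NamedTuple):
    tag: str           # URB identifier
    timestamp_us: int  # timestamp in microseconds
    event: str         # 'S' submission, 'C' callback, 'E' submission error
    xfer: str          # transfer type and direction, e.g. 'Bo' = bulk out
    bus: int
    device: int
    endpoint: int
    rest: str          # remaining words, left unparsed

def parse_usbmon_line(line: str) -> Optional[UsbmonEvent]:
    """Parse the leading fields of one usbmon text line, or return None."""
    fields = line.split(None, 4)
    if len(fields) < 4:
        return None
    tag, ts, event, addr = fields[:4]
    rest = fields[4].strip() if len(fields) > 4 else ""
    parts = addr.split(":")
    if len(parts) != 4:
        return None
    try:
        return UsbmonEvent(tag, int(ts), event, parts[0],
                           int(parts[1]), int(parts[2]), int(parts[3]), rest)
    except ValueError:
        return None

# Illustrative line only (not from the reporter's system).
sample = "ffff8800d3a1b000 3725164000 S Bo:4:003:2 -115 31 = 55534243\n"
ev = parse_usbmon_line(sample)
print(ev)
```

Filtering on `ev.bus` and `ev.device` then leaves only the traffic for
the enclosure in question.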

> The error does not happen when plugging the drive in.
> It happens when rw-mounting an ext4 fs on the drive
> (as far as I know, the affected sector 2048 is indeed the ext4 superblock).
> Ro-mounting and reading the same ext4 fs in the same enclosure works fine,
> and a vfat on such a drive can even be rw-mounted and successfully written.
> Hence, obviously rw-mounting an ext4 fs emits some special write command
> which fails with that controller since 4.18.

The command which actually failed was a perfectly standard WRITE(10), 
although it has the FUA (Force Unit Access) bit turned on.  Perhaps 
that caused the problem, or perhaps an earlier command sent the 
controller chip into some sort of error state.
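
For reference, the CDB bytes from the log can be decoded by hand. The
sketch below assumes the standard 10-byte WRITE(10) layout: opcode 0x2a
in byte 0, FUA as bit 3 and DPO as bit 4 of byte 1, a big-endian logical
block address in bytes 2-5, and a big-endian transfer length in bytes
7-8. The function name `decode_write10` is hypothetical.

```python
def decode_write10(cdb_hex: str) -> dict:
    """Decode the main fields of a WRITE(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    return {
        "fua": bool(cdb[1] & 0x08),                 # Force Unit Access
        "dpo": bool(cdb[1] & 0x10),                 # Disable Page Out
        "lba": int.from_bytes(cdb[2:6], "big"),     # starting block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length
    }

# CDB copied from the dmesg output quoted above.
info = decode_write10("2a 08 00 00 08 00 00 00 08 00")
print(info)
```

The decoded values are LBA 2048 and a length of 8 blocks with FUA set,
which matches the "sector 2048" in the error message.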

> And a completely different guess: The controller is UAS-blacklisted.
> Do you know if it has always been, or if it was blacklisted with 4.18?
> If linux used UAS before 4.18 and switched to usb-storage in 4.18,
> that could also explain why I never saw that error before...
> (just telling from the speed I'd assume that the enclosures used UAS
> up to now: These are my backup drives, and before 4.18,
> they had sustained read/write transfer rates at physical drive speed,
> around 120 MB/s. I don't think usb-storage is able to do that...).

The blacklist entry was added in 2014 (commit c6fa3945c8b5).  It first 
appeared in kernel version 3.19.

Alan Stern



