Re: I/O errors while writing to external Transcend XS-2000 4TB SSD

Kent Overstreet - 11.02.24, 19:51:32 CET:
> On Sun, Feb 11, 2024 at 06:06:27PM +0100, Martin Steigerwald wrote:
[…]
> > CC'ing BCacheFS mailing list.
> > 
> > My original mail is here:
> > 
> > https://lore.kernel.org/linux-usb/5264d425-fc13-6a77-2dbf-6853479051a0@applied-asynchrony.com/T/#m5ec9ecad1240edfbf41ad63c7aeeb6aa6ea38a5e
> > 
> > Holger Hoffstätte - 11.02.24, 17:02:29 CET:
> > > On 2024-02-11 16:42, Martin Steigerwald wrote:
> > > > Hi!
> > > > I am trying to put data on an external Kingston XS-2000 4 TB SSD using
> > > > self-compiled Linux 6.7.4 kernel and encrypted BCacheFS. I do not
> > > > think BCacheFS has any part in the errors I see, but if you disagree
> > > > feel free to CC the BCacheFS mailing list as you reply.
> > > 
> > > This is indeed a known bug with bcachefs on USB-connected devices.
> > > Apply the following commit:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/bcachefs?id=3e44f325f6f75078cdcd44cd337f517ba3650d05
> > > 
> > > This and some other commits are already scheduled for -stable.
> > 
> > Thanks!
> > 
> > Oh my. I was aware of some bug fixes coming for stable. I briefly
> > looked through them, but now I did not make a connection.
> > 
> > I will wait for 6.7.5 and retry then I bet.
> 
> That doesn't look related - the device claims to not support flush or
> fua, and the bug resulted in us not sending flush/fua devices; the main
> thing people would see without that patch, on 6.8, would be an immediate
> -EOPNOTSUP on the first flush journal write.
> 
> He only got errors after an hour or so, or 10 minutes with UAS disabled;
> we send flushes once a second. Sounds like a screwy device.

Thanks for that explanation, Kent.

I am the one with that external Kingston XS2000 4 TB SSD, and I
specifically did not CC the bcachefs mailing list at the beginning, because
after seeing things like

[33963.462694] sd 0:0:0:0: [sda] tag#10 uas_zap_pending 0 uas-tag 1 inflight: CMD 
[33963.462708] sd 0:0:0:0: [sda] tag#10 CDB: Write(16) 8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00
[…]
[33963.592872] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_RESET driverbyte=DRIVER_OK cmd_age=182s

I thought some quirk of the device was at fault.
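
For reference, the CDB in the log line above can be decoded by hand. The
following is a minimal Python sketch, assuming the standard WRITE(16) layout
(opcode 0x8a, DPO and FUA bits in byte 1, 8-byte big-endian LBA, 4-byte
big-endian transfer length). The function name decode_write16 is made up
for illustration.

```python
def decode_write16(cdb_hex):
    """Decode the main fields of a SCSI WRITE(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between the hex bytes is ignored
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    return {
        "dpo": bool(cdb[1] & 0x10),                  # disable page out
        "fua": bool(cdb[1] & 0x08),                  # force unit access
        "lba": int.from_bytes(cdb[2:10], "big"),     # starting logical block
        "blocks": int.from_bytes(cdb[10:14], "big"), # transfer length in blocks
    }


if __name__ == "__main__":
    # CDB copied from the kernel log excerpt above
    print(decode_write16("8a 00 00 00 00 00 82 c1 bc 00 00 00 04 00 00 00"))
```

For the command above this gives LBA 2193734656 and a transfer length of
1024 blocks, with neither DPO nor FUA set, so the command that was reset
after 182 seconds was an ordinary write.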

However, while a Sandisk Extreme Pro 2 TB claims to support DPO and FUA, I see

Write cache: disabled, read cache: enabled, doesn't support DPO or FUA

also with other devices like external Toshiba Canvio 4 TB hard disks. Using
LUKS-encrypted BTRFS on those, I never saw any timeouts while writing out
data with any of those hard disks. Also, with the write cache disabled,
any cache flush / FUA request should be a no-op anyway? These hard disks
have been doing a ton of backup workloads without any issues, but so far
only with BTRFS.

I may test the Kingston XS2000 with BTRFS to see whether it makes a
difference. However, I would really like to use it with BCacheFS, and I do
not really like to use LUKS for external devices. Judging from the kernel
log, I still don't think those errors at the block layer were about anything
filesystem specific, but what do I know?

With UAS enabled for the Kingston XS2000 I see:

Write cache: disabled, read cache: enabled, doesn't support DPO or FUA

This sounds about right: without cache flush / FUA support, the write cache
should be disabled.

With UAS disabled, using only usb-storage, however I see:

Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Which appears to be broken to me: If it cannot do cache flush / FUA it
should have write cache disabled.

Thus I removed the quirk to disable UAS again. It did not help anyway.
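
To compare what the sd driver reports with and without UAS, the relevant
lines can be pulled out of the kernel log mechanically. A minimal Python
sketch follows. The function name is hypothetical, and the "supports DPO and
FUA" wording for the positive case is an assumption, as only the negative
wording appears in the lines quoted above.

```python
import re

# Pattern for the cache summary line logged by the sd driver.
# The positive wording "supports DPO and FUA" is an assumption.
CACHE_LINE = re.compile(
    r"Write cache: (enabled|disabled), "
    r"read cache: (enabled|disabled), "
    r"(supports DPO and FUA|doesn't support DPO or FUA)"
)


def parse_cache_line(line):
    """Return the cache settings from one log line, or None if it does not match."""
    match = CACHE_LINE.search(line)
    if match is None:
        return None
    write_cache, read_cache, dpo_fua = match.groups()
    return {
        "write_cache": write_cache == "enabled",
        "read_cache": read_cache == "enabled",
        "dpo_fua": dpo_fua.startswith("supports"),
    }


if __name__ == "__main__":
    uas = "Write cache: disabled, read cache: enabled, doesn't support DPO or FUA"
    usb_storage = "Write cache: enabled, read cache: enabled, doesn't support DPO or FUA"
    print("uas:        ", parse_cache_line(uas))
    print("usb-storage:", parse_cache_line(usb_storage))
```

The two results differ only in the write_cache field, which is the
difference between the UAS and usb-storage cases described above.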

However, when I look at the output of "hdparm -I" for that Kingston XS2000,
none of this makes sense, because it blatantly advertises support for

[…]
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
[…]
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
[…]

It has firmware revision S9K00107. I will see whether I can get this updated
in case an update is available. How to do that is not obvious to me, as
Kingston only offers a Windows application for updating the firmware.

I asked them how to do an update on Linux, but I am also prepared to go to
a friend with a Windows system to do the update.

There is no urgency in this, so let's see whether a firmware update
fixes anything. In case someone has any additional insight, feel free to add
it. Otherwise I consider the case closed until I retest with either Linux
kernel 6.7.5 or 6.8-rc4 and/or after a firmware update, if one is
available.

Maybe also some other quirks would need to be enabled for that
device? I tested it with:

% cat /etc/modprobe.d/disable-uas.conf
# Does not work with external SSD Transcend XS2000 4TB
options usb-storage quirks=0951:176b:u

but, as explained, that did not help, and thus I removed the UAS-disabling
quirk again.
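
For anyone unfamiliar with the quirks= syntax used above: per the kernel
parameter documentation, the usb-storage quirks option takes a
comma-separated list of VendorID:ProductID:Flags entries with hexadecimal
IDs, and the flag letter u means IGNORE_UAS. A minimal Python sketch of
splitting such a value follows. parse_quirks is a hypothetical helper, and
only the u flag is given a name here.

```python
# Only the flag used in this thread is named; other letters are passed through.
KNOWN_FLAGS = {"u": "IGNORE_UAS"}


def parse_quirks(value):
    """Split a usb-storage quirks value into vendor ID, product ID and flags."""
    entries = []
    for item in value.split(","):
        vid, pid, flags = item.strip().split(":")
        entries.append({
            "vendor_id": int(vid, 16),    # IDs are written in hexadecimal
            "product_id": int(pid, 16),
            "flags": [KNOWN_FLAGS.get(f, f) for f in flags],
        })
    return entries


if __name__ == "__main__":
    for entry in parse_quirks("0951:176b:u"):
        print(f"{entry['vendor_id']:04x}:{entry['product_id']:04x}", entry["flags"])
```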

Best,
-- 
Martin






