Re: Errors on an SSD drive




Chris Murphy wrote:
On Wed, Aug 9, 2017, 11:55 AM Mark Haney <mark.haney@xxxxxxxxxxx> wrote:

To be honest, I'd not try a btrfs volume on a notebook SSD. I did that on a
couple of systems and it corrupted pretty quickly. I'd stick with xfs/ext4
if you manage to get the drive working again.


Sounds like a hardware problem. Btrfs is explicitly optimized for SSDs; the
maintainers worked for FusionIO for several years of its development. If
the drive is silently corrupting data, Btrfs will pretty much immediately
start complaining where other filesystems will carry on. Bad RAM can also
result in scary warnings that you don't get with other filesystems. And I've
been using it on numerous SSDs for years and on NVMe for a year with zero
problems.

That's one thing I've been wondering about:  When using btrfs RAID, do you
need to somehow monitor the disks to see if one has failed?
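
One possible way to watch for this is to poll the per-device error counters
that `btrfs device stats <mountpoint>` prints. The sketch below is only an
illustration: the function names and the sample text are made up for this
example, and it assumes the usual `[device].counter  value` output layout of
btrfs-progs. It covers error counters only; it does not by itself detect a
device that has dropped out of the array entirely.

```python
import re
import subprocess

# One line of `btrfs device stats` output, e.g. "[/dev/sdb].read_io_errs   12"
STAT_RE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {device: {counter: value}} parsed from `btrfs device stats` output."""
    stats = {}
    for line in text.splitlines():
        m = STAT_RE.match(line.strip())
        if m:
            stats.setdefault(m["dev"], {})[m["counter"]] = int(m["value"])
    return stats


def devices_with_errors(stats):
    """Keep only the devices (and counters) whose error count is non-zero."""
    flagged = {}
    for dev, counters in stats.items():
        nonzero = {name: value for name, value in counters.items() if value > 0}
        if nonzero:
            flagged[dev] = nonzero
    return flagged


def read_device_stats(mountpoint):
    """Run `btrfs device stats` on a mounted filesystem (usually needs root)."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mountpoint],
        check=True, capture_output=True, text=True,
    )
    return parse_device_stats(result.stdout)


# Made-up sample output for a two-device filesystem (not from a real system).
SAMPLE = """\
[/dev/sda].write_io_errs    0
[/dev/sda].read_io_errs     0
[/dev/sda].flush_io_errs    0
[/dev/sda].corruption_errs  0
[/dev/sda].generation_errs  0
[/dev/sdb].write_io_errs    0
[/dev/sdb].read_io_errs     12
[/dev/sdb].flush_io_errs    0
[/dev/sdb].corruption_errs  3
[/dev/sdb].generation_errs  0
"""

print(devices_with_errors(parse_device_stats(SAMPLE)))
# {'/dev/sdb': {'read_io_errs': 12, 'corruption_errs': 3}}
```

Run from cron, something like `devices_with_errors(read_device_stats("/mnt/data"))`
(the mount point is a placeholder) could send mail whenever the result is not
empty.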

On CentOS though, I'd get newer btrfs-progs RPM from Fedora, and use either
an elrepo.org kernel, a Fedora kernel, or build my own latest long-term
from kernel.org. There's just too much development that's happened since
the tree found in RHEL/CentOS kernels.

I can't go with a more recent kernel version until NVIDIA has updated their
drivers to no longer need fence.h (or whatever it was).

And I thought stuff gets backported, especially things as important as file
systems.

Also FWIW Red Hat is deprecating Btrfs, in the RHEL 7.4 announcement.
Support will be removed probably in RHEL 8. I have no idea how it'll affect
CentOS kernels though. It will remain in Fedora kernels.

That would suck badly, to the point at which I'd have to look for yet another
distribution.  The only one remaining is Arch.

What do they suggest as a replacement?  The only other FS that comes close is
ZFS, and removing btrfs altogether would be taking living in the past too many
steps too far.

Anyway, blkdiscard can be used on an SSD, whole or partition, to zero them
out. And at least recent ext4 and XFS mkfs will do a blkdiscard, same as
mkfs.btrfs.
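
Before discarding anything, it can be worth checking that the kernel thinks
the device accepts discards at all. A minimal sketch, assuming the sysfs
attribute /sys/block/<disk>/queue/discard_max_bytes is present and that a
value of 0 means no discard support; the function names are invented for this
example, and it only works for whole disks (e.g. "sda"), not partitions:

```python
from pathlib import Path


def discard_max_bytes(device, sysfs_root="/sys/block"):
    """Largest single discard the kernel will issue to `device`, in bytes."""
    path = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    return int(path.read_text().strip())


def supports_discard(device, sysfs_root="/sys/block"):
    """True if the reported discard limit is non-zero."""
    return discard_max_bytes(device, sysfs_root) > 0
```

For example, `supports_discard("sda")` on the affected machine. The sketch
deliberately does not call blkdiscard itself, since that destroys the data on
the target device.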


Chris Murphy







On Wed, Aug 9, 2017 at 1:48 PM, hw <hw@xxxxxxxx> wrote:

Robert Moskowitz wrote:

I am building a new system using a Kingston 240GB SSD drive I pulled
from my notebook (when I had to upgrade to a 500GB SSD drive).  The CentOS
install went fine and ran for a couple of days, then I got errors on the
console.  Here is an example:

[168176.995064] sd 0:0:0:0: [sda] tag#14 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[168177.004050] sd 0:0:0:0: [sda] tag#14 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
[168177.011615] blk_update_request: I/O error, dev sda, sector 17066160
[168487.534510] sd 0:0:0:0: [sda] tag#17 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[168487.543576] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
[168487.551206] blk_update_request: I/O error, dev sda, sector 17066160
[168787.813941] sd 0:0:0:0: [sda] tag#20 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[168787.822951] sd 0:0:0:0: [sda] tag#20 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
[168787.830544] blk_update_request: I/O error, dev sda, sector 17066160
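
The CDB bytes in these messages can be decoded to confirm that every failure
is a read of the same location. The sketch below is an illustration only:
the helper names are invented here, and it relies on the standard SCSI
READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in
bytes 7-8). The 4 KiB figure in the comment assumes 512-byte logical blocks.

```python
import re

# "CDB: Read(10)" followed by ten hex bytes; \s+ also tolerates a wrapped line.
CDB_RE = re.compile(r"CDB: Read\(10\)((?:\s+[0-9a-fA-F]{2}){10})")


def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as ten space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: number of blocks
    return lba, blocks


def failing_reads(log_text):
    """Return (lba, blocks) for every Read(10) CDB found in kernel log text."""
    return [
        decode_read10(" ".join(m.group(1).split()))
        for m in CDB_RE.finditer(log_text)
    ]


# Two of the CDB lines from the log above.
sample = """\
[168177.004050] sd 0:0:0:0: [sda] tag#14 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
[168487.543576] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
"""

print(failing_reads(sample))
# [(17066160, 8), (17066160, 8)]  -> 8 blocks = 4 KiB with 512-byte blocks
```

The decoded LBA, 17066160, is the same sector that blk_update_request
reports, so each failure is a retry of the same 8-block read rather than
errors scattered across the drive.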

Eventually, I could not do anything on the system.  Not even a 'reboot'.
I had to do a cold power cycle to bring things back.

Is there anything I can do about this, or should I trash the drive and start
anew?


Make sure the cables and power supply are OK.  Try the drive in another
machine that has a different controller to see if there is an
incompatibility between the drive and the controller.

You could make a btrfs file system on the whole device: mkfs should report
that a trim operation is performed for the whole device.  Maybe that helps.

If the errors persist, replace the drive.  I'd use Intel SSDs because they
seem to have the fewest problems with broken firmware.  Do not use SSDs
with hardware RAID controllers unless the SSDs were designed for this
application.


_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
https://lists.centos.org/mailman/listinfo/centos




--
Mark Haney
Network Engineer at NeoNova
919-460-3330 (opt 1) • mark.haney@xxxxxxxxxxx
https://neonova.net/



