On 12/13/20 3:41 PM, Eyal Lebedinsky wrote:
I am not sure which list this should go to, so I am starting here.
I run f32 fully updated
5.9.13-100.fc32.x86_64
on relatively new hardware
kernel: DMI: Gigabyte Technology Co., Ltd. Z390 UD/Z390 UD, BIOS F8 05/24/2019
boot/root/swap/data is on nvme
WD Blue SN550 1TB M.2 2280 NVMe SSD WDS100T2B0C
For the second time this disk stopped working (first was about two
months ago).
It seems that the disk failed hard and could not be reset; the machine
was powered off and on.
I think (not sure) that last time I just hit the reset button, but it did
not boot.
The machine was booted (after dnf update) around 8pm, and crashed at 4am.
Following the earlier crash a serial console was set up which is how I
can see the failure messages.
== nvme related messages
[ 7.488638] nvme nvme0: pci function 0000:06:00.0
[ 7.536593] nvme nvme0: allocated 32 MiB host memory buffer.
[ 7.541819] nvme nvme0: 8/0/0 default/read/poll queues
[ 7.558122] nvme0n1: p1 p2 p3 p4
[ 19.590010] EXT4-fs (nvme0n1p3): mounted filesystem with ordered data mode. Opts: (null)
[ 20.653500] Adding 16777212k swap on /dev/nvme0n1p2. Priority:-2 extents:1 across:16777212k SSFS
[ 20.820539] EXT4-fs (nvme0n1p3): re-mounted. Opts: (null)
[ 23.137206] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null)
[ 23.210717] EXT4-fs (nvme0n1p4): mounted filesystem with ordered data mode. Opts: (null)
## nothing unusual for 8 hours, then
[28972.459036] nvme nvme0: I/O 840 QID 6 timeout, aborting
[28972.464757] nvme nvme0: I/O 565 QID 7 timeout, aborting
[28972.470277] nvme nvme0: I/O 566 QID 7 timeout, aborting
[28973.291025] nvme nvme0: I/O 989 QID 1 timeout, aborting
[28978.603061] nvme nvme0: I/O 990 QID 1 timeout, aborting
[29002.667243] nvme nvme0: I/O 840 QID 6 timeout, reset controller
[29032.875421] nvme nvme0: I/O 24 QID 0 timeout, reset controller
[29074.097644] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
[29074.110354] nvme nvme0: Abort status: 0x371
[29074.114953] nvme nvme0: Abort status: 0x371
[29074.119523] nvme nvme0: Abort status: 0x371
[29074.124114] nvme nvme0: Abort status: 0x371
[29074.128710] nvme nvme0: Abort status: 0x371
[29096.645478] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
[29096.652210] nvme nvme0: Removing after probe failure status: -19
[29119.165921] nvme nvme0: Device not ready; aborting reset, CSTS=0x1
## many I/O errors on nvme0 (p2/p3/p4) repeating until a reboot at 8:30am
## one different message, appearing just once:
[29123.800844] nvme nvme0: failed to set APST feature (-19)
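For anyone who wants to tally the timeout messages above rather than read them by eye, here is a minimal sketch in Python. It is not part of the original report: the function names `summarize_nvme_log` and `format_uptime` are hypothetical, and the regular expression is an assumption based only on the message formats shown in this excerpt. It counts timeouts per queue and converts the first timeout's kernel timestamp into an uptime, which can be checked against the "booted around 8pm, crashed at 4am" timeline.

```python
import re
from collections import Counter

# Assumption: this pattern is derived only from the timeout lines quoted
# in this thread; other kernels may word these messages differently.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvme (?P<dev>nvme\d+): "
    r"I/O (?P<tag>\d+) QID (?P<qid>\d+) timeout, (?P<action>.+)$"
)


def summarize_nvme_log(lines):
    """Return (uptime in seconds of first timeout, Counter of (qid, action))."""
    first_ts = None
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line.strip())
        if not m:
            continue  # not a timeout line
        if first_ts is None:
            first_ts = float(m.group("ts"))
        counts[(int(m.group("qid")), m.group("action"))] += 1
    return first_ts, counts


def format_uptime(seconds):
    """Format a kernel timestamp (seconds since boot) as hours and minutes."""
    total = int(seconds)
    return f"{total // 3600}h{(total % 3600) // 60:02d}m"


# The timeout lines copied from the log above.
sample = [
    "[28972.459036] nvme nvme0: I/O 840 QID 6 timeout, aborting",
    "[28972.464757] nvme nvme0: I/O 565 QID 7 timeout, aborting",
    "[28972.470277] nvme nvme0: I/O 566 QID 7 timeout, aborting",
    "[28973.291025] nvme nvme0: I/O 989 QID 1 timeout, aborting",
    "[28978.603061] nvme nvme0: I/O 990 QID 1 timeout, aborting",
    "[29002.667243] nvme nvme0: I/O 840 QID 6 timeout, reset controller",
    "[29032.875421] nvme nvme0: I/O 24 QID 0 timeout, reset controller",
]

first, counts = summarize_nvme_log(sample)
print("first timeout at uptime", format_uptime(first))
for (qid, action), n in sorted(counts.items()):
    print(f"QID {qid}: {n} x {action}")
```

The first timeout falls a little over eight hours after boot, which matches the 8pm boot and 4am crash. The timeouts hit several I/O queues and then the admin queue (QID 0) within about a minute.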
The setup is:
/dev/nvme0n1p1 976M 381M 528M 42% /boot
/dev/nvme0n1p3 204G 62G 131G 33% /
/dev/nvme0n1p4 696G 31G 630G 5% /data
Hi Eyal,
Probably not what you want to hear, or even what you asked, but I
will not sell ANY SSD drive except those from Samsung.
The other vendors have given me too much grief.
What I would do is get a Samsung drive and try again.
If you have two NVMe slots, you can use Clonezilla
with the advanced rescue switch.
:'(
-T