Repetitive catastrophic failure of a SSD

First of all, I'm not sure I'm writing to the right place, but it's my
best guess. If it's not, I'm sorry for the noise, and I'll be happy to be
re-routed to the proper person/mailing-list.

I'll detail my story at the end of this mail, but as a quick summary:

* Got an Asus ROG G551JW, including a Kingston smsm151s3128gd SSD
* SSD running Ubuntu failed after 1 month, got replaced by support
* Replacement failed again in the very same way 4 months later.

It's possible that I've been unlucky and got two bad units, but there
is also the possibility that this model of SSD has some intrinsic
defect / peculiarity which is handled by the OS it's provided with,
but not by Linux.

In that case, I guess I can help improve the Linux driver handling
this SSD and prevent other users from bricking their SSD as I did.

(My experience with kernel development is minimal, but I have a good
background in system programming, and I used to compile my own
kernels when I was young :) )

Detailed story:

I bought this Asus ROG G551JW five months ago. It has an HDD (/dev/sda), an SSD (/dev/sdb) and a DVD drive (/dev/sdc).

As soon as I got it, I wiped Windows from the SSD and installed Ubuntu 15.04 instead. As I don't have huge space requirements at this point, I left the HDD alone and ran everything from the SSD. I simply used the "use the whole disk" option while installing.

After exactly 1 month of satisfactory use, the SSD just failed. It started with every Chrome tab crashing. When I reached a terminal to see what was going on, all I saw was some "INPUT/OUTPUT" errors. I switched the machine off, then on again. After that, the system never booted again: after a long wait, the EFI/BIOS either displayed a "NO SYSTEM DISK" message or went directly to its settings menu. In that menu, the SSD was rarely listed and mostly just absent, while the HDD and DVD drive were always listed.

After many difficulties (random stalls, reboots...), I managed to boot a live session and install a new system on the HDD to investigate what was going on. After that, the boot process still took ages to go from the EFI/BIOS to grub (I guess it was trying to probe the failing SSD and finally timed out...) but was otherwise quite normal. Once logged in, /dev/sdb (the SSD) sometimes showed up and sometimes was missing, randomly at each boot. When it was there, it was impossible to extract any information from it (smartctl -a showed nothing interesting, and hexdump -C /dev/sdb returned immediately without printing anything).
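The manual check described above (is /dev/sdb present, and can anything be read from it?) can be sketched as a small script. This is only a minimal sketch, assuming Python 3 and enough privileges to open the raw device; `probe_device` and its status strings are hypothetical names chosen for illustration, not part of any existing tool.

```python
import errno
import os
import sys


def probe_device(path, nbytes=512):
    """Try to read the first nbytes of path; return a (status, detail) tuple."""
    # The device node may be missing entirely, as happened on some boots.
    if not os.path.exists(path):
        return ("missing", "no such device node")
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return ("open-failed", errno.errorcode.get(exc.errno, str(exc.errno)))
    try:
        data = os.read(fd, nbytes)
    except OSError as exc:
        # A failing drive would typically surface here as EIO.
        return ("read-failed", errno.errorcode.get(exc.errno, str(exc.errno)))
    finally:
        os.close(fd)
    if len(data) == 0:
        # Matches the hexdump symptom: the read returns at once with no data.
        return ("empty", "read returned 0 bytes")
    return ("ok", "read %d bytes" % len(data))


if __name__ == "__main__":
    # /dev/sdb is the SSD in this setup; pass another path to probe it instead.
    device = sys.argv[1] if len(sys.argv) > 1 else "/dev/sdb"
    print(device, *probe_device(device))
```

Running it at each boot and keeping the output would give a simple record of how often the drive is missing, unreadable, or empty.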

dmesg showed some interesting output, but I wasn't able to google anything useful from it except: "your drive is dead dude, get over it" (typical output here: http://pastebin.com/v7eJxmg9 ).

So I called support, explained the situation, sent them the computer, and got it back in no time with a brand new SSD inside. I wiped Windows again and reinstalled Ubuntu, all on the SSD, exactly like the first time.

During the next 4 months, I had some worrying warnings, such as one time when my root partition was remounted read-only on errors, but after a reboot everything was fine. I also noticed that I/O was sluggish when the SSD was almost full (sometimes with stalls of several seconds on reads/writes), so I always kept the free space on the drive at ~20%.

But last week, the exact same failure happened again: same input/output errors, no SSD listed in the EFI/BIOS menu, same messages in dmesg. At this point, it's still possible that I'm very, very, very unlucky and had 2 defective units in a row (I haven't found an outcry from users of this laptop online, so I don't think it's a defect affecting the whole series...), but I doubt it.
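To compare the kernel messages from two failures more easily, the storage-related lines can be pulled out of a saved dmesg dump. The following is a rough sketch, assuming Python 3; the patterns (libata port prefixes such as "ata2.00:", sdX device names, and the phrase "I/O error") are my assumption about what is relevant and may need adjusting to the actual log. `storage_lines` is a hypothetical helper name.

```python
import re
import sys

# Assumed patterns for storage-related kernel messages; adjust as needed.
PATTERNS = [
    re.compile(r"\bata\d+(\.\d+)?:"),         # libata port/device prefix
    re.compile(r"\bsd[a-z]+\b"),              # SCSI disk names like sdb
    re.compile(r"I/O error", re.IGNORECASE),  # block-layer error reports
]


def storage_lines(log_text):
    """Return the lines of log_text that match any of PATTERNS, in order."""
    return [
        line
        for line in log_text.splitlines()
        if any(p.search(line) for p in PATTERNS)
    ]


if __name__ == "__main__":
    # Reads a kernel log from standard input and prints the matching lines.
    for line in storage_lines(sys.stdin.read()):
        print(line)
```

It can be fed from a pipe, for example `dmesg | python3 filter_storage.py` (the script name is arbitrary), and the filtered output of each failure can then be compared with diff.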

Right now I'm not really sure what I want to do. I could call support again and get it replaced again, but if I can't use this SSD reliably it's pretty useless. I could simply remove it and only use the HDD, but unfortunately it's not a standard disk form factor, and I fear that removing it would void the warranty.

If anyone has any idea of what is going on, what diagnostics I could run to get more insight, or who I could contact, I'll be grateful.

--
Marc