Re: How to just delete PGs stuck incomplete on EC pool

Daniel K <sathackr@xxxxxxxxx> · Sat, 2 Mar 2019 09:08:18 -0500

They all just started having read errors, bus resets, and slow reads, which is one of the reasons the cluster didn't recover fast enough to compensate.
I tried to be mindful of the drive type and specifically avoided the larger capacity Seagates that are SMR. Used 1 SM863 for every 6 drives for the WAL.

Not sure why they failed. The data isn't critical at this point, just need to get the cluster back to normal.

On Sat, Mar 2, 2019, 9:00 AM  <jesper@xxxxxxxx> wrote:

Did they break, or did something go wrong while trying to replace them?

Jesper

Saturday, 2 March 2019, 14.34 +0100 from Daniel K  <sathackr@xxxxxxxxx>:

I bought the wrong drives trying to be cheap. They were 2TB WD Blue 5400rpm 2.5 inch laptop drives.
They've now been replaced with HGST 10K 1.8TB SAS drives.

On Sat, Mar 2, 2019, 12:04 AM  <jesper@xxxxxxxx> wrote:

Saturday, 2 March 2019, 04.20 +0100 from sathackr@xxxxxxxxx  <sathackr@xxxxxxxxx>:

56 OSD, 6-node 12.2.5 cluster on Proxmox

We had multiple drives fail (about 30%) within a few days of each other, likely faster than the cluster could recover.

How did so many drives break?

Jesper
