Anthony is correct; this is what I was getting at as well when seeing your ceph -s output. More details are in the Ceph docs here if you want to understand why you need to balance your nodes: https://docs.ceph.com/en/quincy/rados/operations/monitoring-osd-pg/

But you need to get your core RADOS cluster healthy first; then you can pivot to bringing back your gateway services.

Sent from Bloomberg Professional for iPhone

----- Original Message -----
From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
To: nguyenvandiep@xxxxxxxxxxxxxx
CC: ceph-users@xxxxxxx
At: 02/24/24 15:46:24 UTC

Your recovery is stuck because there are no OSDs that have enough space to accept data. Your second OSD host appears to have only 9 OSDs currently, so you should be able to add a 10TB OSD there without removing anything. That will enable data to move to all three of your 10TB OSDs.

> On Feb 24, 2024, at 10:41 AM, nguyenvandiep@xxxxxxxxxxxxxx wrote:
>
> HolySh***
>
> First, we changed mon_max_pg_per_osd to 1000.
>
> About adding a disk to cephosd02: what is TO, sir? I'll discuss it with my boss. To be honest, I'm worried that the recovery process will run into problems...

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
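
For anyone following along, the balance and health checks referenced at the top of the thread can be run from any admin node. A minimal sketch using standard Ceph CLI commands; OSD names and utilization numbers will of course differ per cluster:

    # Per-OSD utilization laid out by CRUSH hierarchy; nearfull or
    # full OSDs are what block recovery and backfill from proceeding.
    ceph osd df tree

    # Explains each health warning in detail, including PGs stuck in
    # states such as backfill_toofull.
    ceph health detail

    # Lists PGs that have not reached active+clean.
    ceph pg dump_stuck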
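If the cluster is managed by cephadm (an assumption; the thread does not say how it was deployed), adding the extra 10TB drive on cephosd02 could look like the sketch below. /dev/sdX is a hypothetical device path, not one taken from this thread:

    # Confirm the new drive is visible to the orchestrator and
    # reported as available on that host.
    ceph orch device ls cephosd02

    # Create an OSD on the new device (replace /dev/sdX with the
    # real device path on cephosd02).
    ceph orch daemon add osd cephosd02:/dev/sdX

Once the new OSD is up and in, CRUSH will begin moving data onto it and the stuck backfill should resume on its own.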
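The mon_max_pg_per_osd change mentioned in the quoted reply can be applied cluster-wide through the monitor config database. A sketch, using the 1000 value from the thread (a deliberately loose limit intended only to unblock recovery, not a long-term setting):

    # Raise the per-OSD PG limit so PGs are allowed to activate
    # while the cluster is unbalanced.
    ceph config set global mon_max_pg_per_osd 1000

    # Verify the value the monitors are now using.
    ceph config get mon mon_max_pg_per_osd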