Cluster problem - Quincy

Hello.



I have a few issues with my Ceph cluster:



- RGWs have disappeared from management (the console does not register
any RGWs), despite 4 services showing as deployed and their processes
running (see the commands I'm using to check this, after this list);

- None of the object buckets are accessible or manageable;

- The console shows some of my pools as “updating”, and it's been like
this for a few days;
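
For reference, these are the standard orchestrator commands I'm using to
check the RGW side (nothing custom here):

    ceph orch ls rgw                 # service specs: 4 RGW services deployed
    ceph orch ps --daemon_type rgw   # individual daemons: all show as running
    ceph -s                          # overall status and active service counts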



What was done (rough command sketches follow this list):



- Expanded a 30 OSD / 3 node cluster to 60 OSDs across 6 nodes;

- Changed the failure domain of our CRUSH rules from OSD to host;

- Increased pg_num and pgp_num on the main data pools to reflect the
additional OSDs and maintain a target of roughly 100 PGs per OSD;

- Attempted to redeploy / re-create the RGW service, including removing
its SSL config;

- Found a single OSD reporting very high ECC error counts and
preemptively removed it from the cluster.
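
For completeness, the changes above were made with commands along these
lines (pool, rule, and service names below are placeholders, not the
exact ones used):

    # failure domain: created a new replicated rule with host buckets,
    # then repointed the pools at it
    ceph osd crush rule create-replicated replicated_host default host
    ceph osd pool set <pool> crush_rule replicated_host

    # PG scaling toward the ~100 PG / OSD target
    ceph osd pool set <pool> pg_num <target>
    ceph osd pool set <pool> pgp_num <target>

    # RGW redeploy attempts (spec re-applied without the SSL options)
    ceph orch apply rgw <service_id> --placement="<placement>"
    ceph orch daemon redeploy rgw.<daemon_name>

    # draining / removing the OSD with the ECC errors
    ceph orch osd rm <osd_id>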



The cluster spent a few days rebalancing, and my impression was that it
would be done by now. It's still not healthy, and as noted above, the
RGWs are no longer manageable. I'm not sure where to start
troubleshooting this, as I've never encountered this scenario before.
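
In case it helps, this is what I've been looking at so far to gauge the
state (standard CLI only):

    ceph health detail        # which health checks are firing
    ceph -s                   # recovery / backfill progress, service counts
    ceph osd pool ls detail   # pg_num vs pgp_num on the pools stuck "updating"
    ceph osd df tree          # per-OSD utilisation after the rebalance
    ceph log last cephadm     # recent cephadm / orchestrator errors

Any pointers on which of these outputs would be most useful to post here
would be appreciated.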



Cluster specs:

- 6 OSD nodes (10 OSDs each);

- 5 monitors;

- 2 managers;

- 5 MDS daemons;

- 4 RGWs;

- Quincy 17.2.5.