If you added OSDs and then deleted them repeatedly, without waiting for replication to finish as the cluster attempted to re-balance across them, it's highly likely that you are permanently missing PGs (especially if the disks were zapped each time).
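For future reference, on Luminous and later you can ask the cluster whether an OSD can be removed safely before you destroy it; roughly like this (osd.N / N are placeholders for the id you are about to remove):

  while ! ceph osd safe-to-destroy osd.N; do sleep 60; done   # blocks until all PGs on it exist elsewhere
  ceph osd purge N --yes-i-really-mean-it                     # only then remove it from the cluster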
If those 3 down OSDs can be revived there is a (small) chance that you can right the ship, but ~1400 PGs/OSD is pretty extreme. I'm surprised the cluster even let you do that - this sounds like a data loss event.
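To see whether they can come back, first find them and try restarting the daemons, e.g. (assuming systemd units, with <id> standing in for one of the down OSD ids):

  ceph osd tree | grep down          # identify the down OSDs and the hosts they live on
  systemctl start ceph-osd@<id>      # on the owning host, try restarting the daemon
  journalctl -u ceph-osd@<id> -e     # check its log if it refuses to start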
Bring back the 3 OSDs and see what those 2 inconsistent PGs look like with ceph pg query.
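Something along these lines should show them (the pg ids come out of health detail; 2.1a below is just a placeholder):

  ceph health detail | grep inconsistent                    # lists the 2 inconsistent pg ids
  ceph pg 2.1a query                                        # state, acting set, why it isn't recovering
  rados list-inconsistent-obj 2.1a --format=json-pretty     # which objects failed scrub, and on which OSDs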
On January 3, 2019 21:59:38 Arun POONIA <arun.poonia@xxxxxxxxxxxxxxxxx> wrote:
Hi,

Recently I tried adding a new node (OSD) to the ceph cluster using the ceph-deploy tool. I was experimenting with the tool and ended up deleting the OSDs on the new server a couple of times. Now that ceph OSDs are running on the new server, cluster PGs seem to be inactive (10-15%) and they are not recovering or rebalancing. Not sure what to do. I tried shutting down the OSDs on the new server.

Status:

[root@fre105 ~]# ceph -s
2019-01-03 18:56:42.867081 7fa0bf573700 -1 asok(0x7fa0b80017a0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph-guests/ceph-client.admin.4018644.140328258509136.asok': (2) No such file or directory
  cluster:
    id:     adb9ad8e-f458-4124-bf58-7963a8d1391f
    health: HEALTH_ERR
            3 pools have many more objects per pg than average
            373907/12391198 objects misplaced (3.018%)
            2 scrub errors
            9677 PGs pending on creation
            Reduced data availability: 7145 pgs inactive, 6228 pgs down, 1 pg peering, 2717 pgs stale
            Possible data damage: 2 pgs inconsistent
            Degraded data redundancy: 178350/12391198 objects degraded (1.439%), 346 pgs degraded, 1297 pgs undersized
            52486 slow requests are blocked > 32 sec
            9287 stuck requests are blocked > 4096 sec
            too many PGs per OSD (2968 > max 200)

  services:
    mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03
    mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02
    osd: 39 osds: 36 up, 36 in; 51 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   18 pools, 54656 pgs
    objects: 6050k objects, 10941 GB
    usage:   21727 GB used, 45308 GB / 67035 GB avail
    pgs:     13.073% pgs not active
             178350/12391198 objects degraded (1.439%)
             373907/12391198 objects misplaced (3.018%)
             46177 active+clean
             5054  down
             1173  stale+down
             1084  stale+active+undersized
             547   activating
             201   stale+active+undersized+degraded
             158   stale+activating
             96    activating+degraded
             46    stale+active+clean
             42    activating+remapped
             34    stale+activating+degraded
             23    stale+activating+remapped
             6     stale+activating+undersized+degraded+remapped
             6     activating+undersized+degraded+remapped
             2     activating+degraded+remapped
             2     active+clean+inconsistent
             1     stale+activating+degraded+remapped
             1     stale+active+clean+remapped
             1     stale+remapped
             1     down+remapped
             1     remapped+peering

  io:
    client: 0 B/s rd, 208 kB/s wr, 28 op/s rd, 28 op/s wr

Thanks
--
Arun Poonia
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com