On 10/30/19 3:04 AM, soumya tr wrote:
> Hi all,
>
> I have a 3 node ceph cluster set up using juju charms. ceph status
> reports inactive pgs.
>
> ---------------
> # ceph status
>   cluster:
>     id:     0e36956e-ef64-11e9-b472-00163e6e01e8
>     health: HEALTH_WARN
>             Reduced data availability: 114 pgs inactive
>
>   services:
>     mon: 3 daemons, quorum juju-06c3e9-0-lxd-0,juju-06c3e9-2-lxd-0,juju-06c3e9-1-lxd-0
>     mgr: juju-06c3e9-0-lxd-0(active), standbys: juju-06c3e9-1-lxd-0, juju-06c3e9-2-lxd-0
>     osd: 3 osds: 3 up, 3 in
>
>   data:
>     pools:   18 pools, 114 pgs
>     objects: 0 objects, 0 B
>     usage:   3.0 GiB used, 34 TiB / 34 TiB avail
>     pgs:     100.000% pgs unknown
>              114 unknown
> ---------------
>
> The health detail also shows the PGs in an inactive state:
>
> -------------------------------
> # ceph health detail
> HEALTH_WARN Reduced data availability: 114 pgs inactive
> PG_AVAILABILITY Reduced data availability: 114 pgs inactive
>     pg 1.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.2 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.3 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.4 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.5 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.6 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.7 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.8 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.9 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 1.a is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 2.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 2.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 3.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 3.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 4.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 4.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 5.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 5.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 6.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 6.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 7.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 7.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 8.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 8.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 9.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 9.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 10.1 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 11.0 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.10 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.11 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.12 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.13 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.14 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.15 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.16 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.17 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.18 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.19 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 17.1a is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.10 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.11 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.12 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.13 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.14 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.15 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.16 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.17 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.19 is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.1a is stuck inactive for 1454.593774, current state unknown, last acting []
>     pg 18.1b is stuck inactive for 1454.593774, current state unknown, last acting []
> --------------------------------
>
> But the weird thing is that when I query an individual pg, it is unable
> to find it :(
>
> --------------------------------
> # ceph pg 1.1 query
> Error ENOENT: i don't have pgid 1.1
>
> # ceph pg 18.1a query
> Error ENOENT: i don't have pgid 18.1a
>
> # ceph pg 18.1b query
> Error ENOENT: i don't have pgid 18.1b
> --------------------------------
>
> As per https://docs.ceph.com/docs/master/rados/operations/pg-states/:
>
> ---------------------------------
> unknown: The ceph-mgr hasn’t yet received any information about the
> PG’s state from an OSD since mgr started up.
> ---------------------------------
>
> I confirmed that all ceph osds are up, and the ceph-mgr service is
> running as well.
>

Did you restart the Mgr?

And is there maybe a firewall in between that might be causing trouble?

This seems like a Mgr issue.

Wido

> Is there anything else that I need to check to rectify the issue?
>
>
> --
> Regards,
> Soumya
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
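
One rough way to act on the firewall question is to test, from each OSD host, whether a TCP connection to the active mgr can be opened at all. The sketch below is only an illustration, not something taken from this thread. It assumes that `ceph mgr dump` prints JSON containing an `active_addr` field in the legacy "ip:port/nonce" form, and the helper names (`parse_mgr_addr`, `tcp_reachable`, `check_active_mgr`) are made up for this example.

```python
import json
import socket
import subprocess


def parse_mgr_addr(mgr_dump_json):
    """Extract (host, port) of the active mgr from `ceph mgr dump` JSON.

    Assumption: the JSON has an 'active_addr' field shaped like
    "10.0.0.5:6800/1234" (ip:port/nonce).
    """
    addr = json.loads(mgr_dump_json)["active_addr"]
    hostport = addr.split("/", 1)[0]          # drop the "/nonce" suffix
    host, _, port = hostport.rpartition(":")  # split on the last colon
    return host.strip("[]"), int(port)        # strip brackets of IPv6 literals


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_active_mgr():
    """Hypothetical helper: look up the active mgr and probe its port.

    Needs a working `ceph` CLI and keyring on the host it runs on.
    """
    dump = subprocess.run(
        ["ceph", "mgr", "dump", "--format", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    host, port = parse_mgr_addr(dump)
    return host, port, tcp_reachable(host, port)
```

Calling `check_active_mgr()` on every OSD host would show whether the mgr port is reachable from there. A `False` result only hints at a firewall or routing problem, and a `True` result does not prove the mgr is actually receiving PG stats, so restarting the mgr as suggested above is still worth trying.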