Re: Brand new cluster -- pg is stuck inactive


 



Strange that no OSD is acting for your PGs.
Can you show the output from
ceph osd tree
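
For readers following along, here is a rough sketch of what one might look for in that output. It assumes the JSON form (`ceph osd tree -f json`) and the field names `nodes`, `stray`, `type`, `name` and `crush_weight` as recalled from Jewel-era output; verify them against your own cluster. The sample data and the `check_tree` helper are hypothetical and are not taken from the poster's cluster.

```python
import json

def check_tree(tree: dict) -> list:
    """Return human-readable notes about OSDs that CRUSH may be unable to use."""
    problems = []
    for node in tree.get("nodes", []):
        if node.get("type") != "osd":
            continue
        # An OSD with CRUSH weight 0 is never selected for placement.
        if node.get("crush_weight", 0) == 0:
            problems.append(f"{node['name']}: CRUSH weight is 0")
    # Assumption: OSDs that exist but are not linked into the hierarchy
    # are reported in a separate "stray" list.
    for node in tree.get("stray", []):
        problems.append(f"{node['name']}: not placed in the CRUSH hierarchy")
    return problems

# Hypothetical sample, shaped like the assumed JSON output.
sample = json.loads("""
{"nodes": [
  {"id": -1, "name": "default", "type": "root", "children": [-2]},
  {"id": -2, "name": "host-a", "type": "host", "children": [0]},
  {"id": 0, "name": "osd.0", "type": "osd", "crush_weight": 0.0, "status": "up"}
 ],
 "stray": [
  {"id": 1, "name": "osd.1", "type": "osd", "status": "up"}
 ]}
""")

problems = check_tree(sample)
for line in problems:
    print(line)
```

Either finding would be consistent with PGs that stay in "creating" with an empty acting set, but the real tree output is needed before drawing any conclusion.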


mvh
Ronny Aasen



On 13.10.2017 18:53, dE wrote:
Hi,

    I'm running ceph 10.2.5 on Debian (official package).

It can't seem to create any functional pools --

ceph health detail
HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive; too few PGs per OSD (21 < min 30)
pg 0.39 is stuck inactive for 652.741684, current state creating, last acting []
pg 0.38 is stuck inactive for 652.741688, current state creating, last acting []
pg 0.37 is stuck inactive for 652.741690, current state creating, last acting []
pg 0.36 is stuck inactive for 652.741692, current state creating, last acting []
pg 0.35 is stuck inactive for 652.741694, current state creating, last acting []
pg 0.34 is stuck inactive for 652.741696, current state creating, last acting []
pg 0.33 is stuck inactive for 652.741698, current state creating, last acting []
pg 0.32 is stuck inactive for 652.741701, current state creating, last acting []
pg 0.3 is stuck inactive for 652.741762, current state creating, last acting []
pg 0.2e is stuck inactive for 652.741715, current state creating, last acting []
pg 0.2d is stuck inactive for 652.741719, current state creating, last acting []
pg 0.2c is stuck inactive for 652.741721, current state creating, last acting []
pg 0.2b is stuck inactive for 652.741723, current state creating, last acting []
pg 0.2a is stuck inactive for 652.741725, current state creating, last acting []
pg 0.29 is stuck inactive for 652.741727, current state creating, last acting []
pg 0.28 is stuck inactive for 652.741730, current state creating, last acting []
pg 0.27 is stuck inactive for 652.741732, current state creating, last acting []
pg 0.26 is stuck inactive for 652.741734, current state creating, last acting []
pg 0.3e is stuck inactive for 652.741707, current state creating, last acting []
pg 0.f is stuck inactive for 652.741761, current state creating, last acting []
pg 0.3f is stuck inactive for 652.741708, current state creating, last acting []
pg 0.10 is stuck inactive for 652.741763, current state creating, last acting []
pg 0.4 is stuck inactive for 652.741773, current state creating, last acting []
pg 0.5 is stuck inactive for 652.741774, current state creating, last acting []
pg 0.3a is stuck inactive for 652.741717, current state creating, last acting []
pg 0.b is stuck inactive for 652.741771, current state creating, last acting []
pg 0.c is stuck inactive for 652.741772, current state creating, last acting []
pg 0.3b is stuck inactive for 652.741721, current state creating, last acting []
pg 0.d is stuck inactive for 652.741774, current state creating, last acting []
pg 0.3c is stuck inactive for 652.741722, current state creating, last acting []
pg 0.e is stuck inactive for 652.741776, current state creating, last acting []
pg 0.3d is stuck inactive for 652.741724, current state creating, last acting []
pg 0.22 is stuck inactive for 652.741756, current state creating, last acting []
pg 0.21 is stuck inactive for 652.741758, current state creating, last acting []
pg 0.a is stuck inactive for 652.741783, current state creating, last acting []
pg 0.20 is stuck inactive for 652.741761, current state creating, last acting []
pg 0.9 is stuck inactive for 652.741787, current state creating, last acting []
pg 0.1f is stuck inactive for 652.741764, current state creating, last acting []
pg 0.8 is stuck inactive for 652.741790, current state creating, last acting []
pg 0.7 is stuck inactive for 652.741792, current state creating, last acting []
pg 0.6 is stuck inactive for 652.741794, current state creating, last acting []
pg 0.1e is stuck inactive for 652.741770, current state creating, last acting []
pg 0.1d is stuck inactive for 652.741772, current state creating, last acting []
pg 0.1c is stuck inactive for 652.741774, current state creating, last acting []
pg 0.1b is stuck inactive for 652.741777, current state creating, last acting []
pg 0.1a is stuck inactive for 652.741784, current state creating, last acting []
pg 0.2 is stuck inactive for 652.741812, current state creating, last acting []
pg 0.31 is stuck inactive for 652.741762, current state creating, last acting []
pg 0.19 is stuck inactive for 652.741789, current state creating, last acting []
pg 0.11 is stuck inactive for 652.741797, current state creating, last acting []
pg 0.18 is stuck inactive for 652.741793, current state creating, last acting []
pg 0.1 is stuck inactive for 652.741820, current state creating, last acting []
pg 0.30 is stuck inactive for 652.741769, current state creating, last acting []
pg 0.17 is stuck inactive for 652.741797, current state creating, last acting []
pg 0.0 is stuck inactive for 652.741829, current state creating, last acting []
pg 0.2f is stuck inactive for 652.741774, current state creating, last acting []
pg 0.16 is stuck inactive for 652.741802, current state creating, last acting []
pg 0.12 is stuck inactive for 652.741807, current state creating, last acting []
pg 0.13 is stuck inactive for 652.741807, current state creating, last acting []
pg 0.14 is stuck inactive for 652.741807, current state creating, last acting []
pg 0.15 is stuck inactive for 652.741808, current state creating, last acting []
pg 0.23 is stuck inactive for 652.741792, current state creating, last acting []
pg 0.24 is stuck inactive for 652.741793, current state creating, last acting []
pg 0.25 is stuck inactive for 652.741793, current state creating, last acting []
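
A listing this long is easier to read once summarized. The following is a minimal sketch, assuming the per-PG line format shown above; the `summarize` helper and its regular expression are illustrative and are not part of Ceph. It counts the stuck PGs and collects those whose acting set is empty.

```python
import re

# Matches one per-PG entry of `ceph health detail` as quoted in this post.
# Whitespace is matched loosely so entries wrapped across lines still parse.
PG_ENTRY = re.compile(
    r"pg\s+(?P<pgid>\d+\.[0-9a-f]+)\s+is\s+stuck\s+(?P<stuck>\w+)\s+for\s+"
    r"(?P<secs>[\d.]+),\s+current\s+state\s+(?P<state>[\w+]+),\s+"
    r"last\s+acting\s+\[(?P<acting>[^\]]*)\]"
)

def summarize(health_detail: str) -> dict:
    """Count stuck PGs and list those with no acting OSDs."""
    entries = [m.groupdict() for m in PG_ENTRY.finditer(health_detail)]
    no_acting = [e["pgid"] for e in entries if not e["acting"].strip()]
    states = sorted({e["state"] for e in entries})
    return {"stuck": len(entries), "no_acting": no_acting, "states": states}

# Two entries copied from the output above; the second is wrapped on purpose.
sample = (
    "pg 0.39 is stuck inactive for 652.741684, current state creating, last acting [] "
    "pg 0.5 is stuck \ninactive for 652.741774, current state creating, last acting []"
)

summary = summarize(sample)
print(summary)
```

If every stuck PG shows up in `no_acting`, as it would for the full output here, the problem is more likely with placement (CRUSH cannot pick any OSD) than with an individual OSD.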

I have 3 OSDs --

ceph osd stat
     osdmap e8: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds

ceph osd pool ls detail
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
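
As a side note on the "too few PGs per OSD (21 < min 30)" part of the health message: the figure 21 is consistent with dividing the pool's 64 PGs by the 3 OSDs that are in. How the monitor actually derives the number is an assumption here; the sketch only reproduces the arithmetic from the values quoted in this post.

```python
pg_num = 64    # from `ceph osd pool ls detail` above
osds_in = 3    # from `ceph osd stat` above
warn_min = 30  # threshold quoted in the health message

# Assumption: the reported figure is an integer division of PGs by "in" OSDs.
per_osd = pg_num // osds_in
too_few = per_osd < warn_min
print(per_osd, too_few)
```

This warning is separate from the stuck PGs; it concerns the PG count of the pool, not whether the PGs can be placed.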

The inactive state seems odd for a brand new pool with no data.

This is my ceph.conf --

[global]
fsid = 8161c91e-dbd2-4491-adf8-74446bef916a
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
debug = 10/10
mon host = 10.242.103.139:8567,10.242.103.140:8567,10.242.103.141:8567
[mon]
ms bind ipv6 = false
mon data = /srv/ceph/mon
mon addr = 0.0.0.0:8567
mon warn on legacy crush tunables = true
mon crush min required version = jewel
mon initial members = 0,1,2
keyring = /etc/ceph/mon_keyring
log file = /var/log/ceph/mon.log
[osd]
osd data = /srv/ceph/osd
osd journal = /srv/ceph/osd/osd_journal
osd journal size = 10240
osd recovery delay start = 10
osd recovery thread timeout = 60
osd recovery max active = 1
osd recovery max chunk = 10485760
osd max backfills = 2
osd backfill retry interval = 60
osd backfill scan min = 100
osd backfill scan max = 1000
keyring = /etc/ceph/osd_keyring

The monitors run on the same hosts as the OSDs.

Any help would be highly appreciated!

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





