Check your network, check what the difference is between the two OSDs that did start and the ones that did not, and check whether /etc/ceph is readable on the OSD nodes. (And please write a normal plain-text email.) A minimal sketch of these checks is at the bottom of this mail, below your quoted message.

> -----Original Message-----
> From: ·å <286204879@xxxxxx>
> Sent: Tuesday, 3 August 2021 09:37
> To: ceph-users <ceph-users@xxxxxxx>
> Subject: 100.000% pgs unknown
>
> Hello, my ceph cluster has crashed, please help me!
>
> #ceph -s
>   cluster:
>     id:     5e1d9b55-9040-494a-9011-d73469dabe1a
>     health: HEALTH_WARN
>             1 filesystem is degraded
>             1 MDSs report slow metadata IOs
>             27 osds down
>             7 hosts (28 osds) down
>             1 root (28 osds) down
>             Reduced data availability: 1760 pgs inactive
>
>   services:
>     mon: 3 daemons, quorum ceph-m-51,ceph-m-52,ceph-m-53 (age 24h)
>     mgr: ceph-m-51(active, since 4h)
>     mds: cephfs:1/1 {0=ceph-m-51=up:replay} 2 up:standby
>     osd: 30 osds: 2 up (since 24h), 29 in (since 24h)
>
>   data:
>     pools:   7 pools, 1760 pgs
>     objects: 0 objects, 0 B
>     usage:   0 B used, 0 B / 0 B avail
>     pgs:     100.000% pgs unknown
>              1760 unknown
>
> #ceph osd tree
> ID  CLASS WEIGHT   TYPE NAME               STATUS REWEIGHT PRI-AFF
> -25       2.18320  root cache-ssd
> -22       2.18320      host ceph-node-47
>  24   ssd  1.09160          osd.24           up   1.00000 1.00000
>  25   ssd  1.09160          osd.25           up   1.00000 1.00000
>  -1      27.44934  root default
> -28       3.58989      host ceph-node-110
>  26   hdd  0.90970          osd.26          down  0.79999 1.00000
>  27   hdd  0.90970          osd.27          down  0.79999 1.00000
>  28   hdd  0.90970          osd.28          down  0.79999 1.00000
>  29   hdd  0.86079          osd.29          down  0.79999 1.00000
>  -3      14.55475      host ceph-node-54
>   0   hdd  3.63869          osd.0           down        0 1.00000
>   1   hdd  3.63869          osd.1           down  1.00000 1.00000
>   2   hdd  3.63869          osd.2           down  1.00000 1.00000
>   3   hdd  3.63869          osd.3           down  1.00000 1.00000
>  -5       1.76936      host ceph-node-55
>   4   hdd  0.45479          osd.4           down  0.50000 1.00000
>   5   hdd  0.45479          osd.5           down  0.50000 1.00000
>   6   hdd  0.45479          osd.6           down  0.50000 1.00000
>   7   hdd  0.40500          osd.7           down  0.50000 1.00000
>  -7       2.22427      host ceph-node-56
>   8   hdd  0.90970          osd.8           down  0.50000 1.00000
>   9   hdd  0.45479          osd.9           down  0.50000 1.00000
>  10   hdd  0.45479          osd.10          down  0.50000 1.00000
>  11   hdd  0.40500          osd.11          down  0.50000 1.00000
>  -9       1.77036      host ceph-node-57
>  12   hdd  0.45479          osd.12          down  0.50000 1.00000
>  13   hdd  0.45479          osd.13          down  0.50000 1.00000
>  14   hdd  0.45479          osd.14          down  0.50000 1.00000
>  15   hdd  0.40599          osd.15          down  0.50000 1.00000
> -11       1.77036      host ceph-node-58
>  16   hdd  0.45479          osd.16          down  0.50000 1.00000
>  17   hdd  0.45479          osd.17          down  0.50000 1.00000
>  18   hdd  0.45479          osd.18          down  0.50000 1.00000
>  19   hdd  0.40599          osd.19          down  0.50000 1.00000
> -13       1.77036      host ceph-node-59
>  20   hdd  0.45479          osd.20          down  0.50000 1.00000
>  21   hdd  0.45479          osd.21          down  0.50000 1.00000
>  22   hdd  0.45479          osd.22          down  0.50000 1.00000
>  23   hdd  0.40599          osd.23          down  0.50000 1.00000
>
> #ceph fs dump
> e110784
> enable_multiple, ever_enabled_multiple: 0,0
> compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
> legacy client fscid: 1
>
> Filesystem 'cephfs' (1)
> fs_name cephfs
> epoch   110784
> flags   32
> created 2020-09-18 11:05:47.190096
> modified        2021-08-03 15:33:04.379969
> tableserver     0
> root    0
> session_timeout 60
> session_autoclose       300
> max_file_size   1099511627776
> min_compat_client       -1 (unspecified)
> last_failure    0
> last_failure_osd_epoch  18808
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
> max_mds 1
> in      0
> up      {0=554577}
> failed
> damaged
> stopped
> data_pools      [13]
> metadata_pool   14
> inline_data     disabled
> balancer
> standby_count_wanted    1
> [mds.ceph-m-51{0:554577} state up:replay seq 37643 addr [v2:192.168.221.51:6800/1640891715,v1:192.168.221.51:6801/1640891715]]
>
> Standby daemons:
> [mds.ceph-m-52{-1:554746} state up:standby seq 2 addr [v2:192.168.221.52:6800/1652218221,v1:192.168.221.52:6801/1652218221]]
> [mds.ceph-m-53{-1:554789} state up:standby seq 2 addr [v2:192.168.221.53:6800/1236333507,v1:192.168.221.53:6801/1236333507]]
> dumped fsmap epoch 110784
>
> When I start an osd, the error message is:
> "
> Aug 03 15:05:16 ceph-node-54 ceph-osd[101387]: 2021-08-03 15:05:16.337 7f4ac0702700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
> Aug 03 15:05:16 ceph-node-54 ceph-osd[101387]: 2021-08-03 15:05:16.337 7f4abff01700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
> Aug 03 15:05:16 ceph-node-54 ceph-osd[101387]: failed to fetch mon config (--no-mon-config to skip)
> "
> Every osd is like this!
> I exported the crush map and found no abnormalities. How can I fix it next?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
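PS - a minimal sketch of the checks I mean, assuming default paths, that the OSD daemons run as the "ceph" user, and that your mons are on the 192.168.221.51-53 addresses visible in your fs dump (adjust to your setup):

  # network: can a failing OSD node reach every mon on the default msgr ports?
  ping -c 3 192.168.221.51
  nc -zv 192.168.221.51 3300   # msgr2
  nc -zv 192.168.221.51 6789   # msgr1
  # repeat for .52/.53, and for the cluster network if you have a separate one

  # /etc/ceph: present and readable by the ceph user?
  ls -ld /etc/ceph && ls -l /etc/ceph
  sudo -u ceph cat /etc/ceph/ceph.conf >/dev/null && echo "ceph.conf readable"

  # difference between the two OSDs that did start (osd.24/25 on ceph-node-47)
  # and a failing one: compare ceph.conf and the OSD keyrings host by host, e.g.
  scp ceph-node-47:/etc/ceph/ceph.conf /tmp/ceph.conf.47   # hypothetical temp path
  diff /etc/ceph/ceph.conf /tmp/ceph.conf.47
  sudo ceph auth get osd.0                      # run where the admin key still works
  sudo cat /var/lib/ceph/osd/ceph-0/keyring     # should match the key shown above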