Hi David,

Apologies for the late response.

NodeB is mon+client, nodeC is client.

ceph health detail:

HEALTH_ERR 819 pgs are stuck inactive for more than 300 seconds; 883 pgs degraded; 64 pgs stale; 819 pgs stuck inactive; 1064 pgs stuck unclean; 883 pgs undersized; 22 requests are blocked > 32 sec; 3 osds have slow requests; recovery 2/8 objects degraded (25.000%); recovery 2/8 objects misplaced (25.000%); crush map has legacy tunables (require argonaut, min is firefly); crush map has straw_calc_version=0
pg 2.fc is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
pg 2.fd is stuck inactive since forever, current state undersized+degraded+peered, last acting [0]
pg 2.fe is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
pg 2.ff is stuck inactive since forever, current state undersized+degraded+peered, last acting [1]
pg 1.fb is stuck inactive for 493857.572982, current state undersized+degraded+peered, last acting [4]
pg 2.f8 is stuck inactive since forever, current state undersized+degraded+peered, last acting [3]
pg 1.fa is stuck inactive for 492185.443146, current state undersized+degraded+peered, last acting [0]
pg 2.f9 is stuck inactive since forever, current state undersized+degraded+peered, last acting [0]
pg 1.f9 is stuck inactive for 492185.452890, current state undersized+degraded+peered, last acting [2]
pg 2.fa is stuck inactive since forever, current state undersized+degraded+peered, last acting [3]
pg 1.f8 is stuck inactive for 492185.443324, current state undersized+degraded+peered, last acting [0]
pg 2.fb is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
...
pg 1.fb is undersized+degraded+peered, acting [4]
pg 2.ff is undersized+degraded+peered, acting [1]
pg 2.fe is undersized+degraded+peered, acting [2]
pg 2.fd is undersized+degraded+peered, acting [0]
pg 2.fc is undersized+degraded+peered, acting [2]
3 ops are blocked > 536871 sec on osd.4
15 ops are blocked > 268435 sec on osd.4
1 ops are blocked > 262.144 sec on osd.4
2 ops are blocked > 268435 sec on osd.3
1 ops are blocked > 268435 sec on osd.1
3 osds have slow requests
recovery 2/8 objects degraded (25.000%)
recovery 2/8 objects misplaced (25.000%)
crush map has legacy tunables (require argonaut, min is firefly); see http://ceph.com/docs/master/rados/operations/crush-map/#tunables
crush map has straw_calc_version=0; see http://ceph.com/docs/master/rados/operations/crush-map/#tunables

ceph osd stat:

cluster-admin@nodeB:~/.ssh/ceph-cluster$ cat ceph_osd_stat.txt
osdmap e80: 10 osds: 5 up, 5 in; 558 remapped pgs
flags sortbitwise

ceph osd tree:

cluster-admin@nodeB:~/.ssh/ceph-cluster$ ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 9.08691 root default
-2 4.54346     host nodeB
 5 0.90869         osd.5    down        0          1.00000
 6 0.90869         osd.6    down        0          1.00000
 7 0.90869         osd.7    down        0          1.00000
 8 0.90869         osd.8    down        0          1.00000
 9 0.90869         osd.9    down        0          1.00000
-3 4.54346     host nodeC
 0 0.90869         osd.0      up  1.00000          1.00000
 1 0.90869         osd.1      up  1.00000          1.00000
 2 0.90869         osd.2      up  1.00000          1.00000
 3 0.90869         osd.3      up  1.00000          1.00000
 4 0.90869         osd.4      up  1.00000          1.00000

CrushMap:

# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host nodeB {
    id -2        # do not change unnecessarily
    # weight 4.543
    alg straw
    hash 0       # rjenkins1
    item osd.5 weight 0.909
    item osd.6 weight 0.909
    item osd.7 weight 0.909
    item osd.8 weight 0.909
    item osd.9 weight 0.909
}
host nodeC {
    id -3        # do not change unnecessarily
    # weight 4.543
    alg straw
    hash 0       # rjenkins1
    item osd.0 weight 0.909
    item osd.1 weight 0.909
    item osd.2 weight 0.909
    item osd.3 weight 0.909
    item osd.4 weight 0.909
}
root default {
    id -1        # do not change unnecessarily
    # weight 9.087
    alg straw
    hash 0       # rjenkins1
    item nodeB weight 4.543
    item nodeC weight 4.543
}

# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
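(For anyone reading along: a decompiled CRUSH map like the one above is normally produced with something along these lines; the file names here are only examples.)

# dump the current CRUSH map in binary form, then decompile it to text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt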
ceph.conf:

cluster-admin@nodeB:~/.ssh/ceph-cluster$ cat /etc/ceph/ceph.conf
[global]
fsid = a04e9846-6c54-48ee-b26f-d6949d8bacb4
mon_initial_members = nodeB
mon_host = <mon IP>
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = X.X.X.0/24

On Sat, Jun 18, 2016 at 12:15 PM, David <dclistslinux@xxxxxxxxx> wrote:

Is this a test cluster that has never been healthy or a working cluster which has just gone unhealthy? Have you changed anything? Are all hosts, drives, network links working? More detail please. Any/all of the following would help:
ceph health detail
ceph osd stat
ceph osd tree
Your ceph.conf
Your crushmap

On 17 Jun 2016 14:14, "Ishmael Tsoaela" <ishmaelt3@xxxxxxxxx> wrote:
>
> Hi All,
>
> please assist to fix the error:
>
> 1 X admin
> 2 X admin (hosting admin as well)
>
> 4 osd each node

Please provide more detail; this suggests you should have 12 OSDs, but your osd map shows 10 OSDs, 5 of which are down.
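A quick way to see why those OSDs are down is to check the daemons on the affected host directly, for example (a rough sketch assuming systemd-managed OSDs; osd.5 is just one of the down IDs from the tree):

# check whether the daemon is running, and try to bring it back up
sudo systemctl status ceph-osd@5
sudo systemctl start ceph-osd@5
# if it will not stay up, the OSD log usually says why
sudo tail -n 50 /var/log/ceph/ceph-osd.5.log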
>
>
> cluster a04e9846-6c54-48ee-b26f-d6949d8bacb4
> health HEALTH_ERR
> 819 pgs are stuck inactive for more than 300 seconds
> 883 pgs degraded
> 64 pgs stale
> 819 pgs stuck inactive
> 245 pgs stuck unclean
> 883 pgs undersized
> 17 requests are blocked > 32 sec
> recovery 2/8 objects degraded (25.000%)
> recovery 2/8 objects misplaced (25.000%)
> crush map has legacy tunables (require argonaut, min is firefly)
> crush map has straw_calc_version=0
> monmap e1: 1 mons at {nodeB=155.232.195.4:6789/0}
> election epoch 7, quorum 0 nodeB
> osdmap e80: 10 osds: 5 up, 5 in; 558 remapped pgs
> flags sortbitwise
> pgmap v480: 1064 pgs, 3 pools, 6454 bytes data, 4 objects
> 25791 MB used, 4627 GB / 4652 GB avail
> 2/8 objects degraded (25.000%)
> 2/8 objects misplaced (25.000%)
> 819 undersized+degraded+peered
> 181 active
> 64 stale+active+undersized+degraded
>
>
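(Side note on the two crush map warnings above: once the down OSDs are dealt with, they are usually cleared with the commands below. This is only a sketch; it assumes all clients support Firefly or newer tunables, and changing tunables can trigger data movement.)

# move off the legacy (argonaut) tunables profile
ceph osd crush tunables firefly
# switch to the newer straw bucket weight calculation
ceph osd crush set-tunable straw_calc_version 1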
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
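(To dig into the stuck PGs themselves, the usual next step is something like the following; pg 2.fc is taken from the health detail above.)

# list PGs stuck inactive, then query one of them for its peering state
ceph pg dump_stuck inactive
ceph pg 2.fc query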
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Best regards,
施柏安 Desmond Shih
技術研發部 Technical Development
迎棧科技股份有限公司 (inwinstack)
886-975-857-982
desmond.s@inwinstack.com
886-2-7738-2858 #7725
新北市220板橋區遠東路3號5樓C室
Rm.C, 5F., No.3, Yuandong Rd., Banqiao Dist., New Taipei City 220, Taiwan (R.O.C)
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com