You might also check out "ceph osd tree" and the crush dump and make sure they look the way you expect.

On Mon, Jan 30, 2017 at 1:23 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Sun, Jan 29, 2017 at 6:40 AM, Muthusamy Muthiah
> <muthiah.muthusamy@xxxxxxxxx> wrote:
>> Hi All,
>>
>> Also tried EC profile 3+1 on a 5-node cluster with bluestore enabled. When
>> an OSD is down, the cluster goes to ERROR state even though the cluster is
>> n+1, and no recovery happens.
>>
>>      health HEALTH_ERR
>>             75 pgs are stuck inactive for more than 300 seconds
>>             75 pgs incomplete
>>             75 pgs stuck inactive
>>             75 pgs stuck unclean
>>      monmap e2: 5 mons at
>>             {ca-cn1=10.50.5.117:6789/0,ca-cn2=10.50.5.118:6789/0,ca-cn3=10.50.5.119:6789/0,ca-cn4=10.50.5.120:6789/0,ca-cn5=10.50.5.121:6789/0}
>>             election epoch 10, quorum 0,1,2,3,4 ca-cn1,ca-cn2,ca-cn3,ca-cn4,ca-cn5
>>      mgr active: ca-cn1 standbys: ca-cn4, ca-cn3, ca-cn5, ca-cn2
>>      osdmap e264: 60 osds: 59 up, 59 in; 75 remapped pgs
>>             flags sortbitwise,require_jewel_osds,require_kraken_osds
>>      pgmap v119402: 1024 pgs, 1 pools, 28519 GB data, 21548 kobjects
>>             39976 GB used, 282 TB / 322 TB avail
>>                  941 active+clean
>>                   75 remapped+incomplete
>>                    8 active+clean+scrubbing
>>
>> This seems to be an issue with bluestore; recovery is not happening
>> properly with EC.
>
> It's possible, but it seems a lot more likely this is some kind of
> config issue. Can you share your osd map ("ceph osd getmap")?
> -Greg
>
>> Thanks,
>> Muthu
>>
>> On 24 January 2017 at 12:57, Muthusamy Muthiah
>> <muthiah.muthusamy@xxxxxxxxx> wrote:
>>>
>>> Hi Greg,
>>>
>>> We use EC 4+1 on 5-node clusters in production deployments with filestore,
>>> and recovery and peering do happen when one OSD goes down. After a few
>>> minutes, another OSD on the node hosting the failed OSD takes over its PGs
>>> temporarily and all PGs go to active+clean. The cluster also does not go
>>> down during this recovery process.
>>>
>>> Only on bluestore do we see the cluster going to error state when one OSD
>>> is down. We are still validating this and will let you know any additional
>>> findings.
>>>
>>> Thanks,
>>> Muthu
>>>
>>> On 21 January 2017 at 02:06, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
>>>>
>>>> `ceph pg dump` should show you something like:
>>>>
>>>> * active+undersized+degraded ... [NONE,3,2,4,1] 3 [NONE,3,2,4,1]
>>>>
>>>> Sam,
>>>>
>>>> Am I wrong? Or is it up to something else?
>>>>
>>>> On Sat, Jan 21, 2017 at 4:22 AM, Gregory Farnum <gfarnum@xxxxxxxxxx>
>>>> wrote:
>>>> > I'm pretty sure the default configs won't let an EC PG go active with
>>>> > only "k" OSDs in its PG; it needs at least k+1 (or possibly more? Not
>>>> > certain). Running an "n+1" EC config is just not a good idea.
>>>> > For testing you could probably adjust this with the equivalent of
>>>> > min_size for EC pools, but I don't know the parameters off the top of
>>>> > my head.
>>>> > -Greg
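
A minimal sketch of the min_size check Greg is referring to, for anyone following along; the pool name "ecpool" and profile name "ecprofile" are placeholders (the thread never names them), and the default min_size can differ between releases:

    # show k, m and the failure domain of the profile behind the pool
    ceph osd erasure-code-profile get ecprofile

    # show the pool's current min_size; with k=4 this is commonly k+1 = 5
    ceph osd pool get ecpool min_size

    # testing only: allow PGs to go active with just k shards available
    # (running a production pool this way is exactly the risk noted above)
    ceph osd pool set ecpool min_size 4
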
>>>> >
>>>> > On Fri, Jan 20, 2017 at 2:15 AM, Muthusamy Muthiah
>>>> > <muthiah.muthusamy@xxxxxxxxx> wrote:
>>>> >> Hi,
>>>> >>
>>>> >> We are validating kraken 11.2.0 with bluestore on a 5-node cluster
>>>> >> with EC 4+1.
>>>> >>
>>>> >> When an OSD is down, peering does not happen and the ceph health
>>>> >> status moves to ERR state after a few minutes. This was working in
>>>> >> previous development releases. Is any additional configuration
>>>> >> required in v11.2.0?
>>>> >>
>>>> >> Following is our ceph configuration:
>>>> >>
>>>> >> mon_osd_down_out_interval = 30
>>>> >> mon_osd_report_timeout = 30
>>>> >> mon_osd_down_out_subtree_limit = host
>>>> >> mon_osd_reporter_subtree_level = host
>>>> >>
>>>> >> and the recovery parameters are set to their defaults.
>>>> >>
>>>> >> [root@ca-cn1 ceph]# ceph osd crush show-tunables
>>>> >>
>>>> >> {
>>>> >>     "choose_local_tries": 0,
>>>> >>     "choose_local_fallback_tries": 0,
>>>> >>     "choose_total_tries": 50,
>>>> >>     "chooseleaf_descend_once": 1,
>>>> >>     "chooseleaf_vary_r": 1,
>>>> >>     "chooseleaf_stable": 1,
>>>> >>     "straw_calc_version": 1,
>>>> >>     "allowed_bucket_algs": 54,
>>>> >>     "profile": "jewel",
>>>> >>     "optimal_tunables": 1,
>>>> >>     "legacy_tunables": 0,
>>>> >>     "minimum_required_version": "jewel",
>>>> >>     "require_feature_tunables": 1,
>>>> >>     "require_feature_tunables2": 1,
>>>> >>     "has_v2_rules": 1,
>>>> >>     "require_feature_tunables3": 1,
>>>> >>     "has_v3_rules": 0,
>>>> >>     "has_v4_buckets": 0,
>>>> >>     "require_feature_tunables5": 1,
>>>> >>     "has_v5_rules": 0
>>>> >> }
>>>> >>
>>>> >> ceph status:
>>>> >>
>>>> >>      health HEALTH_ERR
>>>> >>             173 pgs are stuck inactive for more than 300 seconds
>>>> >>             173 pgs incomplete
>>>> >>             173 pgs stuck inactive
>>>> >>             173 pgs stuck unclean
>>>> >>      monmap e2: 5 mons at
>>>> >>             {ca-cn1=10.50.5.117:6789/0,ca-cn2=10.50.5.118:6789/0,ca-cn3=10.50.5.119:6789/0,ca-cn4=10.50.5.120:6789/0,ca-cn5=10.50.5.121:6789/0}
>>>> >>             election epoch 106, quorum 0,1,2,3,4 ca-cn1,ca-cn2,ca-cn3,ca-cn4,ca-cn5
>>>> >>      mgr active: ca-cn1 standbys: ca-cn2, ca-cn4, ca-cn5, ca-cn3
>>>> >>      osdmap e1128: 60 osds: 59 up, 59 in; 173 remapped pgs
>>>> >>             flags sortbitwise,require_jewel_osds,require_kraken_osds
>>>> >>      pgmap v782747: 2048 pgs, 1 pools, 63133 GB data, 46293 kobjects
>>>> >>             85199 GB used, 238 TB / 322 TB avail
>>>> >>                 1868 active+clean
>>>> >>                  173 remapped+incomplete
>>>> >>                    7 active+clean+scrubbing
>>>> >>
>>>> >> MON log:
>>>> >>
>>>> >> 2017-01-20 09:25:54.715684 7f55bcafb700  0 log_channel(cluster) log [INF] :
>>>> >> osd.54 out (down for 31.703786)
>>>> >> 2017-01-20 09:25:54.725688 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1120
>>>> >> crush map has features 288250512065953792, adjusting msgr requires
>>>> >> 2017-01-20 09:25:54.729019 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>>>> >> osdmap e1120: 60 osds: 59 up, 59 in
>>>> >> 2017-01-20 09:25:54.735987 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>>>> >> pgmap v781993: 2048 pgs: 1869 active+clean, 173 incomplete, 6
>>>> >> active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB / 322 TB
>>>> >> avail; 21825 B/s rd, 163 MB/s wr, 2046 op/s
>>>> >> 2017-01-20 09:25:55.737749 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1121
>>>> >> crush map has features 288250512065953792, adjusting msgr requires
>>>> >> 2017-01-20 09:25:55.744338 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>>>> >> osdmap e1121: 60 osds: 59 up, 59 in
>>>> >> 2017-01-20 09:25:55.749616 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>>>> >> pgmap v781994: 2048 pgs: 29 remapped+incomplete, 1869 active+clean, 144
>>>> >> incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used,
>>>> >> 238 TB / 322 TB avail; 44503 B/s rd, 45681 kB/s wr, 518 op/s
>>>> >> 2017-01-20 09:25:56.768721 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>>>> >> pgmap v781995: 2048 pgs: 47 remapped+incomplete, 1869 active+clean, 126
>>>> >> incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used,
>>>> >> 238 TB / 322 TB avail; 20275 B/s rd, 72742 kB/s wr, 665 op/s
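
A quick way to see why those PGs sit in remapped+incomplete instead of recovering is to query them directly; a minimal sketch (the PG id 2.1ab below is a placeholder, not taken from this cluster):

    # list the stuck PGs and the detailed health messages
    ceph health detail
    ceph pg dump_stuck inactive

    # query one incomplete PG; its recovery_state section usually shows what
    # peering is blocked on (e.g. down_osds_we_would_probe, or fewer shards
    # up than the erasure-code rule and the pool's min_size allow)
    ceph pg 2.1ab query
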
>>>> >>
>>>> >> Thanks,
>>>> >> Muthu
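
For completeness, the checks suggested at the top of the thread ("ceph osd tree" and the crush dump) and the osdmap Greg asked for can be gathered roughly like this; /tmp/osdmap is just an example output path:

    # topology as CRUSH sees it; confirm each host holds the OSDs you expect
    ceph osd tree

    # full CRUSH map, including the rule used by the EC pool
    ceph osd crush dump

    # binary osdmap to share; osdmaptool can decode it offline
    ceph osd getmap -o /tmp/osdmap
    osdmaptool --print /tmp/osdmap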