Hi Joao,

In the meantime I have done the following things:

$ ceph osd crush move ceph-osd15 rack=rack1-pdu1
moved item id -17 name 'ceph-osd15' to location {rack=rack1-pdu1} in crush map

$ ceph osd crush rm rack2-pdu3
removed item id -23 name 'rack2-pdu3' from crush map

But it does not solve the problem either. I saw in the documentation that
restarting the OSDs where the PGs are stuck could help... I did restart all
the OSDs, but that leads to the following status:

    cluster 4a8669b9-b379-43b2-9488-7fca6e1366bc
     health HEALTH_WARN 80 pgs degraded; 152 pgs peering; 411 pgs stale; 166 pgs stuck inactive; 411 pgs stuck stale; 620 pgs stuck unclean; recovery 51106/694410 objects degraded (7.360%)
     monmap e2: 3 mons at {ceph-mon0=10.1.2.1:6789/0,ceph-mon1=10.1.2.2:6789/0,ceph-mon2=10.1.2.3:6789/0}, election epoch 68, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
     osdmap e1825: 16 osds: 16 up, 16 in
      pgmap v301798: 712 pgs, 5 pools, 1350 GB data, 338 kobjects
            2763 GB used, 5615 GB / 8379 GB avail
            51106/694410 objects degraded (7.360%)
                 152 stale+peering
                  73 stale+active+remapped
                  80 stale+active+degraded+remapped
                  92 stale+active+clean
                 301 active+remapped
                  14 stale

You'll find my crush map here: http://pastebin.com/F9aFjcjm

Two quick command sketches (querying the stuck PGs, and testing the ruleset
offline with crushtool) are appended after the quoted message below.

Cheers,

Olivier.

----- Original Message -----
> From: "Joao Eduardo Luis" <joao.luis at inktank.com>
> To: "Olivier DELHOMME" <olivier.delhomme at mines-paristech.fr>, ceph-users at lists.ceph.com
> Sent: Wednesday, 23 July 2014 19:39:52
> Subject: Re: [ceph-users] MON segfaulting when setting a crush ruleset to a pool (firefly 0.80.4)
>
> Hey Olivier,
>
> On 07/23/2014 02:06 PM, Olivier DELHOMME wrote:
> > Hello,
> >
> > I'm running a test cluster (mon and osd are debian 7
> > with 3.2.57-3+deb7u2 kernel). The client is a debian 7
> > with a 3.15.4 kernel that I compiled myself.
> >
> > The cluster has 3 monitors and 16 osd servers.
> > I created a pool (periph) and used it a bit and then
> > I decided to create some buckets and moved the hosts
> > into:
>
> Can you share your crush map?
>
> Cheers!
>   -Joao
>
>
> > $ ceph osd crush add-bucket rack1-pdu1 rack
> > $ ceph osd crush add-bucket rack1-pdu2 rack
> > $ ceph osd crush add-bucket rack1-pdu3 rack
> > $ ceph osd crush add-bucket rack2-pdu1 rack
> > $ ceph osd crush add-bucket rack2-pdu2 rack
> > $ ceph osd crush add-bucket rack2-pdu3 rack
> > $ ceph osd crush move ceph-osd0 rack=rack1-pdu1
> > $ ceph osd crush move ceph-osd1 rack=rack1-pdu1
> > $ ceph osd crush move ceph-osd2 rack=rack1-pdu1
> > $ ceph osd crush move ceph-osd3 rack=rack1-pdu2
> > $ ceph osd crush move ceph-osd4 rack=rack1-pdu2
> > $ ceph osd crush move ceph-osd5 rack=rack1-pdu2
> > $ ceph osd crush move ceph-osd6 rack=rack1-pdu3
> > $ ceph osd crush move ceph-osd7 rack=rack1-pdu3
> > $ ceph osd crush move ceph-osd8 rack=rack1-pdu3
> > $ ceph osd crush move ceph-osd9 rack=rack2-pdu1
> > $ ceph osd crush move ceph-osd10 rack=rack2-pdu1
> > $ ceph osd crush move ceph-osd11 rack=rack2-pdu1
> > $ ceph osd crush move ceph-osd12 rack=rack2-pdu2
> > $ ceph osd crush move ceph-osd13 rack=rack2-pdu2
> > $ ceph osd crush move ceph-osd14 rack=rack2-pdu2
> > $ ceph osd crush move ceph-osd15 rack=rack2-pdu3
> >
> > It did well:
> >
> > $ ceph osd tree
> > # id    weight  type name               up/down reweight
> > -23     0.91        rack rack2-pdu3
> > -17     0.91            host ceph-osd15
> > 15      0.91                osd.15      up      1
> > -22     1.81        rack rack2-pdu2
> > -14     0.45            host ceph-osd12
> > 12      0.45                osd.12      up      1
> > -15     0.45            host ceph-osd13
> > 13      0.45                osd.13      up      1
> > -16     0.91            host ceph-osd14
> > 14      0.91                osd.14      up      1
> > -21     1.35        rack rack2-pdu1
> > -11     0.45            host ceph-osd9
> > 9       0.45                osd.9       up      1
> > -12     0.45            host ceph-osd10
> > 10      0.45                osd.10      up      1
> > -13     0.45            host ceph-osd11
> > 11      0.45                osd.11      up      1
> > -20     1.35        rack rack1-pdu3
> > -8      0.45            host ceph-osd6
> > 6       0.45                osd.6       up      1
> > -9      0.45            host ceph-osd7
> > 7       0.45                osd.7       up      1
> > -10     0.45            host ceph-osd8
> > 8       0.45                osd.8       up      1
> > -19     1.35        rack rack1-pdu2
> > -5      0.45            host ceph-osd3
> > 3       0.45                osd.3       up      1
> > -6      0.45            host ceph-osd4
> > 4       0.45                osd.4       up      1
> > -7      0.45            host ceph-osd5
> > 5       0.45                osd.5       up      1
> > -18     1.35        rack rack1-pdu1
> > -2      0.45            host ceph-osd0
> > 0       0.45                osd.0       up      1
> > -3      0.45            host ceph-osd1
> > 1       0.45                osd.1       up      1
> > -4      0.45            host ceph-osd2
> > 2       0.45                osd.2       up      1
> > -1      0       root default
> >
> > But then, when trying to set the crush_ruleset to the
> > pool with the command below, it crashes two of the three
> > monitors.
> >
> > $ ceph osd pool set periph crush_ruleset 2
> > 2014-07-23 14:43:38.942811 7fa9696a3700  0 monclient: hunting for new mon
> >
> > The first monitor's log ends with:
> >
> >     -4> 2014-07-23 14:43:37.476121 7f52d2f46700  1 -- 10.1.2.1:6789/0 --> 10.1.2.100:0/1027991 -- mon_command_ack([{"prefix": "get_command_descriptions"}]=0 v0) v1 -- ?+29681 0x3b1c780 con 0x2a578c0
> >     -3> 2014-07-23 14:43:37.598549 7f52d2f46700  1 -- 10.1.2.1:6789/0 <== client.39105 10.1.2.100:0/1027991 8 ==== mon_command({"var": "crush_ruleset", "prefix": "osd pool set", "pool": "periph", "val": "2"} v 0) v1 ==== 122+0+0 (2844980124 0 0) 0x3b1d860 con 0x2a578c0
> >     -2> 2014-07-23 14:43:37.598602 7f52d2f46700  0 mon.ceph-mon0@0(leader) e2 handle_command mon_command({"var": "crush_ruleset", "prefix": "osd pool set", "pool": "periph", "val": "2"} v 0) v1
> >     -1> 2014-07-23 14:43:37.598705 7f52d2f46700  1 mon.ceph-mon0@0(leader).paxos(paxos active c 542663..543338) is_readable now=2014-07-23 14:43:37.598708 lease_expire=2014-07-23 14:43:41.683421 has v0 lc 543338
> >      0> 2014-07-23 14:43:37.601706 7f52d2f46700 -1 *** Caught signal (Segmentation fault) **
> >  in thread 7f52d2f46700
> >
> > Then, after an election attempt, the second monitor goes down as well:
> >
> >     -1> 2014-07-23 14:43:51.772370 7eff4ba15700  1 mon.ceph-mon1@1(leader).paxos(paxos active c 542663..543338) is_readable now=2014-07-23 14:43:51.772373 lease_expire=2014-07-23 14:43:56.770906 has v0 lc 543338
> >      0> 2014-07-23 14:43:51.775817 7eff4ba15700 -1 *** Caught signal (Segmentation fault) **
> >
> > I cannot bring the monitors back up while the command
> > "ceph osd pool set periph crush_ruleset 2" is still running.
> >
> > When I kill this command, the monitors can run again and
> > return to a normal state, but it leaves the cluster with
> > some warnings, I guess about data movement that was not
> > completed (I had some data in the pool):
> >
> >     cluster 4a8669b9-b379-43b2-9488-7fca6e1366bc
> >      health HEALTH_WARN 152 pgs peering; 166 pgs stuck inactive; 620 pgs stuck unclean; recovery 620/694410 objects degraded (0.089%)
> >      monmap e2: 3 mons at {ceph-mon0=10.1.2.1:6789/0,ceph-mon1=10.1.2.2:6789/0,ceph-mon2=10.1.2.3:6789/0}, election epoch 50, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
> >      osdmap e1688: 16 osds: 16 up, 16 in
> >       pgmap v300875: 712 pgs, 5 pools, 1350 GB data, 338 kobjects
> >             2765 GB used, 5614 GB / 8379 GB avail
> >             620/694410 objects degraded (0.089%)
> >                   14 inactive
> >                  152 peering
> >                  454 active+remapped
> >                   92 active+clean
> >
> > Is there something that I did wrong or forgot to do?
> >
> > While writing this mail I realised that there is only one
> > host in the rack2-pdu3 rack. Could this be a cause of the
> > problem?
> >
> > Thanks for any hints.
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
>
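
The first sketch referenced above: listing the stuck PGs and asking one of
them why it is stuck. These are standard ceph CLI calls; the PG id in the
query command is only a placeholder, to be replaced by one reported by
dump_stuck.

$ ceph pg dump_stuck stale
$ ceph pg dump_stuck inactive
$ ceph pg dump_stuck unclean
$ ceph pg 3.7f query     # 3.7f is a placeholder, use a PG id from the dump_stuck output
$ ceph osd tree          # double-check that rack2-pdu3 is really gone from the tree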
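
The second sketch referenced above: exercising the crush rule offline with
crushtool before pointing the pool at it again, to see whether it can place
the requested number of replicas at all. This is only a rough sketch; it
assumes the rule id is 2 and the pool size is 3, adjust to your values.

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt    # decompiled map, to read rule 2 by eye
$ crushtool -i crushmap.bin --test --rule 2 --num-rep 3 --show-bad-mappings

If --show-bad-mappings prints anything, the rule cannot satisfy all the
requested replicas with the current hierarchy, which could support the
suspicion about the single-host rack; the monitor segfault itself is a
separate question.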