I appreciate you giving more detail on this.  I plan on expanding the
test cluster to 5 servers soon, so I'll just wait until then before
changing the number of replicas.

Thanks,
Bryan

On Tue, Jan 8, 2013 at 3:49 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> Yep! The "step chooseleaf firstn 0 type host" means "choose n nodes of
> type host, and select a leaf under each one of them", where n is the
> pool size. You only have two hosts, so it can't do more than 2 with
> that rule type.
> You could do "step chooseleaf firstn 0 type device", but that won't
> guarantee segregation across hosts, unfortunately. CRUSH isn't great
> at dealing with situations where you want your number of copies to be
> equal to or greater than your total failure domain count. You can
> make it work if you're willing to hardcode some stuff, but it's not
> real pleasant.
> -Greg
>
> On Tue, Jan 8, 2013 at 2:28 PM, Bryan Stillwell
> <bstillwell@xxxxxxxxxxxxxxx> wrote:
>> That would make sense. Here's what the metadata rule looks like:
>>
>> rule metadata {
>>         ruleset 1
>>         type replicated
>>         min_size 2
>>         max_size 10
>>         step take default
>>         step chooseleaf firstn 0 type host
>>         step emit
>> }
>>
>> On Tue, Jan 8, 2013 at 3:23 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> What are your CRUSH rules? Depending on how you set this cluster up,
>>> it might not be placing more than one replica in a single host, and
>>> you've only got two hosts, so it couldn't satisfy your request for 3
>>> copies.
>>> -Greg
>>>
>>> On Tue, Jan 8, 2013 at 2:11 PM, Bryan Stillwell
>>> <bstillwell@xxxxxxxxxxxxxxx> wrote:
>>>> I tried increasing the number of metadata replicas from 2 to 3 on my
>>>> test cluster with the following command:
>>>>
>>>> ceph osd pool set metadata size 3
>>>>
>>>> Afterwards it appears that all of the metadata placement groups
>>>> switched to a degraded state and don't seem to be attempting to
>>>> recover:
>>>>
>>>> 2013-01-08 14:49:37.352735 mon.0 [INF] pgmap v156393: 1920 pgs: 1280
>>>> active+clean, 640 active+degraded; 903 GB data, 1820 GB used, 2829 GB
>>>> / 4650 GB avail; 1255/486359 degraded (0.258%)
>>>>
>>>> Does anything need to be done after increasing the number of replicas?
>>>>
>>>> Here's what the OSD tree looks like:
>>>>
>>>> root@a1:~# ceph osd tree
>>>> dumped osdmap tree epoch 1303
>>>> # id    weight    type name          up/down  reweight
>>>> -1      4.99557   pool default
>>>> -3      4.99557     rack unknownrack
>>>> -2      2.49779       host b1
>>>> 0       0.499557        osd.0        up       1
>>>> 1       0.499557        osd.1        up       1
>>>> 2       0.499557        osd.2        up       1
>>>> 3       0.499557        osd.3        up       1
>>>> 4       0.499557        osd.4        up       1
>>>> -4      2.49779       host b2
>>>> 5       0.499557        osd.5        up       1
>>>> 6       0.499557        osd.6        up       1
>>>> 7       0.499557        osd.7        up       1
>>>> 8       0.499557        osd.8        up       1
>>>> 9       0.499557        osd.9        up       1
>>>>
>>>> Thanks,
>>>> Bryan
>>
>> --
>> Bryan Stillwell
>> SYSTEM ADMINISTRATOR
>>
>> E: bstillwell@xxxxxxxxxxxxxxx
>> O: 303.228.5109
>> M: 970.310.6085

--
Bryan Stillwell
SYSTEM ADMINISTRATOR

E: bstillwell@xxxxxxxxxxxxxxx
O: 303.228.5109
M: 970.310.6085
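
P.S. For anyone hitting this later: the kind of hardcoding Greg alludes
to could look roughly like the sketch below. This is an untested
illustration, not something from this thread -- the rule name and
ruleset number are made up, and "device" is assumed to be whatever the
leaf (type 0) is called in this particular crushmap. The idea is to have
the rule produce more candidates than the pool size, since (as I
understand it) only the first <pool size> results a rule emits are used:

rule metadata_wide {
        ruleset 3
        type replicated
        min_size 2
        max_size 10
        step take default
        # ask for pool-size hosts; with only 2 hosts this returns both
        step choose firstn 0 type host
        # take up to 2 devices under each chosen host, giving 4 candidates;
        # with size 3 one host ends up holding 2 copies and the other 1
        step chooseleaf firstn 2 type device
        step emit
}

The trade-off is that one host holds two of the three copies, so losing
that host still loses two replicas at once. The pool would also need to
be pointed at the new ruleset after compiling and injecting the edited
map, e.g. something like "ceph osd pool set metadata crush_ruleset 3".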