Ok, thank you. I thought I had to set ceph to a tunables profile. If I understand correctly, I just have to export the current crush map, edit it and import it again, like:

ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-choose-total-tries 100 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new

Is this right or not?

I started this cluster with these 3 nodes and each 3 osds. They are vms. I knew that this cluster would grow very large; that is the reason I chose ceph. Now I can’t add more HDDs to the vm hypervisor, and I want to separate the nodes physically too. I bought a new node with these 4 drives and now another node with only 2 drives. As I now hear from several people, this was not a good idea. For this reason, I have now bought additional HDDs for the new node, so I will have two nodes with the same number and size of HDDs. In the next 1-2 months I will get the third physical node and then everything should be fine. But at this time I have no other option.

Would adding the 2 new HDDs to the new ceph node help to solve this problem?

> On 11.01.2017 at 12:00, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>
> Your current problem has nothing to do with clients and neither does
> choose_total_tries.
>
> Try setting just this value to 100 and see if your situation improves.
>
> Ultimately you need to take a good look at your cluster configuration
> and how your crush map is configured to deal with that configuration
> but start with choose_total_tries as it has the highest probability of
> helping your situation. Your clients should not be affected.
>
> Could you explain the reasoning behind having three hosts with one osd
> each, one host with two osds and one with four?
>
> You likely need to tweak your crushmap to handle this configuration
> better or, preferably, move to a more uniform configuration.
>
>
> On Wed, Jan 11, 2017 at 5:38 PM, Marcus Müller <mueller.marcus@xxxxxxxxx> wrote:
>> I have to thank you all.
>> You give free support and this already helps me.
>> I’m not the one who knows ceph that well, but every day it’s getting better
>> and better ;-)
>>
>> According to the article Brad posted I have to change the ceph osd crush
>> tunables. But there are two questions left as I already wrote:
>>
>> - According to
>> http://docs.ceph.com/docs/master/rados/operations/crush-map/#tunables there
>> are a few profiles. The profile I need would be BOBTAIL (CRUSH_TUNABLES2),
>> which would set choose_total_tries to 50. For a start, that is better than 19.
>> There I also see: "You can select a profile on a running cluster with the
>> command: ceph osd crush tunables {PROFILE}". My question on this is: Even if
>> I run hammer, is it good and possible to set it to bobtail?
>>
>> - We can also read:
>> WHICH CLIENT VERSIONS SUPPORT CRUSH_TUNABLES2
>> - v0.55 or later, including bobtail series (v0.56.x)
>> - Linux kernel version v3.9 or later (for the file system and RBD kernel
>> clients)
>>
>> And here my question is: If my clients use librados (version hammer), do I
>> need to have this required kernel version on the clients or the ceph nodes?
>>
>> I don’t want to run into trouble with my clients in the end. Can someone
>> answer this for me before I change the settings?
>>
>>
>> On 11.01.2017 at 06:47, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
>>
>>
>> Yeah, Sam is correct. I've not looked at the crushmap. But I should have
>> noticed what the trouble is by looking at `ceph osd tree`. That's my
>> bad, sorry for that.
>>
>> Again please refer to:
>>
>> http://www.anchor.com.au/blog/2013/02/pulling-apart-cephs-crush-algorithm/
>>
>> Regards,
>>
>>
>> On Wed, Jan 11, 2017 at 1:50 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>
>> Shinobu isn't correct, you have 9/9 osds up and running.
>> up does not
>> equal acting because crush is having trouble fulfilling the weights in
>> your crushmap and the acting set is being padded out with an extra osd
>> which happens to have the data to keep you up to the right number of
>> replicas. Please refer back to Brad's post.
>> -Sam
>>
>> On Mon, Jan 9, 2017 at 11:08 PM, Marcus Müller <mueller.marcus@xxxxxxxxx>
>> wrote:
>>
>> Ok, I understand, but how can I debug why they are not running as they
>> should? I thought everything was fine because ceph -s said they are up
>> and running.
>>
>> I would suspect a problem with the crush map.
>>
>> On 10.01.2017 at 08:06, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
>>
>> e.g.,
>> OSD7 / 3 / 0 are in the same acting set. They should be up, if they
>> are properly running.
>>
>> # 9.7
>> <snip>
>>
>> "up": [
>>     7,
>>     3
>> ],
>> "acting": [
>>     7,
>>     3,
>>     0
>> ],
>>
>> <snip>
>>
>> Here is an example:
>>
>> "up": [
>>     1,
>>     0,
>>     2
>> ],
>> "acting": [
>>     1,
>>     0,
>>     2
>> ],
>>
>> Regards,
>>
>>
>> On Tue, Jan 10, 2017 at 3:52 PM, Marcus Müller <mueller.marcus@xxxxxxxxx>
>> wrote:
>>
>>
>> That's not perfectly correct.
>>
>> OSD.0/1/2 seem to be down.
>>
>>
>>
>> Sorry but where do you see this? I think this indicates that they are up:
>> osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs?
>>
>>
>> On 10.01.2017 at 07:50, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
>>
>> On Tue, Jan 10, 2017 at 3:44 PM, Marcus Müller <mueller.marcus@xxxxxxxxx>
>> wrote:
>>
>> All osds are currently up:
>>
>>     health HEALTH_WARN
>>            4 pgs stuck unclean
>>            recovery 4482/58798254 objects degraded (0.008%)
>>            recovery 420522/58798254 objects misplaced (0.715%)
>>            noscrub,nodeep-scrub flag(s) set
>>     monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
>>            election epoch 478, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
>>     osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs
>>            flags noscrub,nodeep-scrub
>>      pgmap v9981077: 320 pgs, 3 pools, 4837 GB data, 19140 kobjects
>>            15070 GB used, 40801 GB / 55872 GB avail
>>            4482/58798254 objects degraded (0.008%)
>>            420522/58798254 objects misplaced (0.715%)
>>                 316 active+clean
>>                   4 active+remapped
>>   client io 56601 B/s rd, 45619 B/s wr, 0 op/s
>>
>> This did not change for two days or so.
>>
>>
>> By the way, my ceph osd df now looks like this:
>>
>> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR
>>  0 1.28899  1.00000  3724G  1699G  2024G 45.63 1.69
>>  1 1.57899  1.00000  3724G  1708G  2015G 45.87 1.70
>>  2 1.68900  1.00000  3724G  1695G  2028G 45.54 1.69
>>  3 6.78499  1.00000  7450G  1241G  6208G 16.67 0.62
>>  4 8.39999  1.00000  7450G  1228G  6221G 16.49 0.61
>>  5 9.51500  1.00000  7450G  1239G  6210G 16.64 0.62
>>  6 7.66499  1.00000  7450G  1265G  6184G 16.99 0.63
>>  7 9.75499  1.00000  7450G  2497G  4952G 33.52 1.24
>>  8 9.32999  1.00000  7450G  2495G  4954G 33.49 1.24
>>               TOTAL 55872G 15071G 40801G 26.97
>> MIN/MAX VAR: 0.61/1.70 STDDEV: 13.16
>>
>> As you can see, osd2 has now also gone down to 45% use and "lost" data. But I
>> think this is no problem and ceph just cleans everything up after
>> backfilling.
>>
>>
>> On 10.01.2017 at 07:29, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
>>
>> Looking at ``ceph -s`` you originally provided, all OSDs are up.
>>
>> osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs
>>
>>
>> But looking at ``pg query``, OSD.0 / 1 are not up. Are they something
>>
>>
>> That's not perfectly correct.
>>
>> OSD.0/1/2 seem to be down.
>>
>> like related to ?:
>>
>> Ceph1, ceph2 and ceph3 are vms on one physical host
>>
>>
>> Are those OSDs running on vm instances?
>>
>> # 9.7
>> <snip>
>>
>> "state": "active+remapped",
>> "snap_trimq": "[]",
>> "epoch": 3114,
>> "up": [
>>     7,
>>     3
>> ],
>> "acting": [
>>     7,
>>     3,
>>     0
>> ],
>>
>> <snip>
>>
>> # 7.84
>> <snip>
>>
>> "state": "active+remapped",
>> "snap_trimq": "[]",
>> "epoch": 3114,
>> "up": [
>>     4,
>>     8
>> ],
>> "acting": [
>>     4,
>>     8,
>>     1
>> ],
>>
>> <snip>
>>
>> # 8.1b
>> <snip>
>>
>> "state": "active+remapped",
>> "snap_trimq": "[]",
>> "epoch": 3114,
>> "up": [
>>     4,
>>     7
>> ],
>> "acting": [
>>     4,
>>     7,
>>     2
>> ],
>>
>> <snip>
>>
>> # 7.7a
>> <snip>
>>
>> "state": "active+remapped",
>> "snap_trimq": "[]",
>> "epoch": 3114,
>> "up": [
>>     7,
>>     4
>> ],
>> "acting": [
>>     7,
>>     4,
>>     2
>> ],
>>
>> <snip>
>>
>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
>
> --
> Cheers,
> Brad
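
The up/acting comparison discussed in this thread can be sketched in Python. This is only a minimal sketch and not part of the original mails: the function name `padded_osds` is hypothetical, and it assumes the input is a JSON object with `up` and `acting` arrays like the `pg query` excerpts quoted in the thread. An OSD that appears in `acting` but not in `up` is the kind of extra OSD Sam describes as padding out the acting set.

```python
import json
from typing import List


def padded_osds(pg_query_json: str) -> List[int]:
    """Return OSD ids present in the acting set but missing from the up set.

    Hypothetical helper: assumes a JSON object with "up" and "acting" arrays.
    """
    pg = json.loads(pg_query_json)
    up = set(pg["up"])
    return [osd for osd in pg["acting"] if osd not in up]


# Values copied from the pg 9.7 excerpt quoted in the thread.
sample = '{"state": "active+remapped", "up": [7, 3], "acting": [7, 3, 0]}'
print(padded_osds(sample))  # [0]
```

For a PG where up and acting are identical, as in Shinobu's second example (1, 0, 2), the function returns an empty list.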