On Thu, 29 Dec 2016, Łukasz Chrustek wrote:
> Hi,
>
> Thank you very much for the analysis and the file! I had a similar one :)
> but wasn't sure whether it would break something in the cluster.
>
> > The encoded tree bucket -11 had bad values. I don't really trust the tree
> > bucket code in crush... it's not well tested (and is a poor balance
> > computation and efficiency anyway). We should probably try to remove tree
> > entirely.
>
> > I've attached a fixed map that you can inject with
> >
> >     ceph osd setcrushmap -i <filename>
>
> Now it works, and ceph osd crush dump -f json-pretty also runs OK.

Great news!

> > Bucket -11 is now empty; not sure what was supposed to be in it.
>
> This server will be reinstalled; there were three OSDs on it.
>
> > I suggest switching all of your tree buckets over to straw2 as soon as
> > possible. Note that this will result in some rebalancing. You could do
> > it one bucket at a time if that's concerning.
>
> OK, will changing the alg to straw2 rebalance all PGs on all nodes?

For any bucket you change from tree -> straw2, you'll see PGs shuffle
between the children of that bucket. So for hosts, you'll see data move
between the disks. And fixing ssd-intel-s3700 will shuffle data between
hosts.

I'd also switch the straw buckets to straw2, although that will move a
comparatively small amount of data.

sage
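
For reference, the usual round trip for hand-editing a crush map, as done
with the fixed map above, looks roughly like this (a sketch; the filenames
are arbitrary):

    # grab the current (binary) crush map from the cluster
    ceph osd getcrushmap -o crushmap.bin

    # decompile to editable text, edit, then recompile
    crushtool -d crushmap.bin -o crushmap.txt
    crushtool -c crushmap.txt -o crushmap.new

    # optionally sanity-check the new map before injecting it
    crushtool -i crushmap.new --test --show-statistics

    # inject it back into the cluster
    ceph osd setcrushmap -i crushmap.new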
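
Switching a bucket's algorithm is a one-word change in the decompiled map.
A sketch, with a made-up host name and OSD weights:

    host node01 {
            id -2               # do not change unnecessarily
            alg straw2          # was: alg tree (or alg straw)
            hash 0              # rjenkins1
            item osd.0 weight 1.000
            item osd.1 weight 1.000
            item osd.2 weight 1.000
    }

To do it one bucket at a time, change a single bucket definition,
recompile and inject the map, and let recovery finish before moving on to
the next bucket.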