Re: CRUSH question - failing to rebalance after failure test

On Thu, 8 Jan 2015, Christopher Kunz wrote:
> On 05.01.15 at 15:16, Christopher Kunz wrote:
> > Hi all,
> > 
> > I think I have a subtle problem with either understanding CRUSH or in
> > the actual implementation of my CRUSH map.
> > 
> > Consider the following CRUSH map: http://paste.debian.net/hidden/085b3f20/

This link doesn't seem to work any more?

> > I have 3 chassis with 7 nodes each (6 of them OSD nodes). Size is 3,
> > min_size is 2 on all pools.
> > If I remove one chassis from the cluster (pull the network plugs, in
> > this case), my naive first thought was that the cluster might recover
> > fully, but I think this cannot be the case, since it will never find a
> > location that satisfies the necessary condition "provide three
> > replicas on different chassis" - as there are only two in operation.
> > 
> > However, after setting "size" to 2 on all pools, the cluster recovered
> > from 33.3% degraded to 20.5% degraded, and is now sitting there.
> > 
> > This is a lab cluster; I'm really only trying to understand what's
> > happening. Can someone clear that up - I think I'm blind...
> > 
> > Regards,
> 
> Does nobody have an idea? This seems like basic functionality to me,
> yet it's not working as intended.

Hmm, if I'm understanding correctly, CRUSH is supposed to find the 2 
replicas in the 2 surviving chassis and recover completely.  The caveat 
is that CRUSH does this by sampling: if there are lots of OSDs still in 
the map in the third chassis that are bad choices and it runs out of 
retries, it will fail to find a good mapping for some PGs.  Increasing 
the retries tunable will help in that case.  It's hard to say whether 
that's what's going on in your case, though, without seeing the map...
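For reference, since the paste is gone: I'd guess the relevant rule looks 
roughly like the stock replicated rule with the chooseleaf type set to 
chassis, i.e. something like this (hypothetical, reconstructed from your 
description, not your actual map):

    rule replicated_ruleset {
            ruleset 0
            type replicated
            min_size 1
            max_size 10
            step take default
            # pick N distinct chassis, then one OSD under each;
            # with only 2 chassis reachable this can return at
            # most 2 OSDs, so size=3 PGs can never fully recover
            step chooseleaf firstn 0 type chassis
            step emit
    }

With a rule like that and one chassis unreachable, CRUSH can hand back at 
most 2 OSDs per PG, which matches what you saw: no full recovery at 
size=3, and recovery only starting once you dropped size to 2.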
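If it is retry exhaustion, raising choose_total_tries is the usual fix.  
From memory, roughly (file names are just examples):

    $ ceph osd getcrushmap -o crush.bin
    $ crushtool -d crush.bin -o crush.txt
    # edit crush.txt and raise the tunable, e.g.:
    #   tunable choose_total_tries 100
    $ crushtool -c crush.txt -o crush.new
    $ ceph osd setcrushmap -i crush.new

You can also sanity-check a map offline before injecting it; assuming 
your pools use rule 0 and size 2, --show-bad-mappings will print any 
inputs CRUSH failed to map to the full replica count:

    $ crushtool -i crush.new --test --rule 0 --num-rep 2 --show-bad-mappings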

sage


> Is this a bug in Giant?
> 
> Regards,
> 
> --ck
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


