CRUSH question - failing to rebalance after failure test

Hi all,

I think I have a subtle problem either with my understanding of CRUSH or
with the actual implementation of my CRUSH map.

Consider the following CRUSH map: http://paste.debian.net/hidden/085b3f20/

I have 3 chassis with 7 nodes each (6 of them OSD nodes). Size is 3 and
min_size is 2 on all pools.
If I remove one chassis from the cluster (by pulling its network plugs,
in this case), my naive first thought was that the cluster would recover
fully. But I think that cannot happen, since CRUSH will never find a
placement that satisfies the rule's condition "three replicas on
different chassis" when only two chassis are in operation.
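
For reference, the placement rule in my map is roughly shaped like the
following (paraphrased from the paste above, so names and numbers may
differ slightly):

    rule replicated_ruleset {
            ruleset 0
            type replicated
            min_size 1
            max_size 10
            step take default
            # pick one leaf (OSD) per distinct chassis
            step chooseleaf firstn 0 type chassis
            step emit
    }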

However, after setting "size" to 2 on all pools, the cluster recovered
from 33.3% degraded to 20.5% degraded, and it has been sitting there
ever since, making no further progress.
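
For what it's worth, I believe the placement can also be simulated
offline with crushtool (a sketch; I'm assuming the replicated rule is
rule 0 in the map):

    # extract the compiled CRUSH map from the cluster
    ceph osd getcrushmap -o crushmap
    # simulate placement of sample PGs with 2 replicas and print mappings
    crushtool -i crushmap --test --rule 0 --num-rep 2 --show-mappings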

This is a lab cluster; I'm really only trying to understand what's
happening. Can someone clear that up? I think I'm blind...

Regards,

--ck
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


