On 2019-12-04 04:11, Janne Johansson wrote:
> On Wed, 4 Dec 2019 at 01:37, Milan Kupcevic <milan_kupcevic@xxxxxxxxxxx> wrote:
>
>> This cluster can handle this case at this moment as it has got plenty of
>> free space. I wonder how is this going to play out when we get to 90% of
>> usage on the whole cluster. A single backplane failure in a node takes
>
> You should not run any file storage system to 90% full, ceph or otherwise.
>
> You should set a target for how full it can get before you must add new
> hardware to it, be it more drives or hosts with drives, and as noted
> below, you should probably include at least one failed node into this
> calculation, so that planned maintenance doesn't become a critical
> situation.

There is plenty of space to take more than a few failed nodes. But the
question was about what goes on inside a node with a few failed drives.
Current Ceph behavior keeps increasing the number of placement groups on
the surviving drives inside the same node; it does not spread them across
the cluster.

So, let's get back to the original question: should the host weight be
reduced automatically on HDD failure, or not?

Milan

--
Milan Kupcevic
Senior Cyberinfrastructure Engineer at Project NESE
Harvard University
FAS Research Computing

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
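
For what it's worth, the effect being asked about can be approximated by
hand today: zeroing the CRUSH weight of a failed OSD also lowers the weight
of the host bucket that contains it, so the remapped placement groups spread
across the cluster instead of landing only on that node's surviving drives.
Below is a rough Python sketch of that idea, built on "ceph osd tree
--format json" and "ceph osd crush reweight"; the host-name argument, the
helper names, and the dry-run default are illustrative assumptions, not an
official tool, so test on a lab cluster first.

    #!/usr/bin/env python3
    # Sketch: zero the CRUSH weight of down OSDs under one host so the host
    # bucket weight shrinks and backfill spreads across the cluster rather
    # than piling onto the surviving drives in the same node.
    import json
    import subprocess
    import sys

    def ceph_json(*args):
        """Run a ceph CLI command and parse its JSON output."""
        out = subprocess.check_output(["ceph", *args, "--format", "json"])
        return json.loads(out)

    def down_osds_under_host(host):
        """Return ids of OSDs under the given host bucket that are down."""
        tree = ceph_json("osd", "tree")
        nodes = {n["id"]: n for n in tree["nodes"]}
        host_node = next(n for n in tree["nodes"]
                         if n["type"] == "host" and n["name"] == host)
        return [i for i in host_node.get("children", [])
                if nodes[i]["type"] == "osd" and nodes[i].get("status") == "down"]

    def zero_crush_weight(osd_id, dry_run=True):
        """Set the failed OSD's CRUSH weight to 0; the host weight drops too."""
        cmd = ["ceph", "osd", "crush", "reweight", "osd.%d" % osd_id, "0"]
        print(" ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)

    if __name__ == "__main__":
        host = sys.argv[1]  # e.g. "node07" (hypothetical host bucket name)
        for osd_id in down_osds_under_host(host):
            zero_crush_weight(osd_id, dry_run=True)

Whether Ceph should do something like this on its own, rather than leaving
it to the operator, is exactly the open question above.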