Re: How to add 100 new OSDs...

Janne Johansson <icepic.dz@xxxxxxxxx> · Thu, 25 Jul 2019 11:31:52 +0200

On Thu, 25 Jul 2019 at 10:47, 展荣臻（信泰） <zhanrzh_xt@xxxxxxxxxxxxxx> wrote:

1. Adding OSDs within a single failure domain ensures that only one OSD in a PG's up set (as shown by ceph pg dump) has to remap.
2. Setting "osd_pool_default_min_size=1" ensures that objects can be read and written without interruption while PGs remap.
Is this wrong?

How did you read the first email, where he described how 3 copies were not enough and wanted to perhaps go to 4 copies
to make sure he is not putting data at risk?

The effect you describe is technically correct: it will allow writes to pass. But it also goes 100% against what ceph tries to do here, which is to retain the data even during planned maintenance and even during unexpected downtime.

Setting min_size=1 means you don't care at all about your data, and that you will be placing it at extreme risk.

Not only is that single copy a danger in itself, but you can easily get into a situation where a single-copy write gets accepted and then that drive gets destroyed. The cluster knows the latest writes ended up on it, so even getting the two older copies back will not help, since it has already registered that a newer version exists somewhere. For a single object, reverting to an older version (if possible) isn't all that bad, but for a section in the middle of a VM drive it could mean total disaster.
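
The failure sequence described above can be sketched as a toy model. This is not how ceph is implemented; it is a minimal illustration, assuming a single placement group with three replicas and a simple version counter. The names Replica, PlacementGroup and scenario are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    version: int = 0   # version of the object this copy holds
    up: bool = True    # False while the OSD is down or destroyed


@dataclass
class PlacementGroup:
    replicas: list
    min_size: int
    last_acked: int = 0  # newest version the cluster has acknowledged

    def write(self) -> bool:
        """Try one write; return True if it is acknowledged to the client."""
        live = [r for r in self.replicas if r.up]
        if len(live) < self.min_size:
            return False  # too few copies up: I/O blocks instead of proceeding
        self.last_acked += 1
        for r in live:
            r.version = self.last_acked
        return True

    def newest_copy_available(self) -> bool:
        """True if some live replica still holds the newest acknowledged version."""
        return any(r.up and r.version == self.last_acked for r in self.replicas)


def scenario(min_size: int) -> tuple:
    """Replay the sequence: two copies away, one write, then the last copy dies."""
    a, b, c = Replica("a"), Replica("b"), Replica("c")
    pg = PlacementGroup([a, b, c], min_size=min_size)
    pg.write()            # healthy: all three copies reach version 1
    b.up = c.up = False   # two copies go away (maintenance, remap, outage)
    acked = pg.write()    # only "a" is left to take the write
    a.up = False          # the drive holding the single new copy is destroyed
    b.up = c.up = True    # the two older copies come back
    return acked, pg.newest_copy_available()


if __name__ == "__main__":
    for m in (1, 2):
        acked, recoverable = scenario(m)
        print(f"min_size={m}: write acknowledged={acked}, newest version recoverable={recoverable}")
```

In this sketch, min_size=1 acknowledges the write and then cannot recover it, while min_size=2 refuses the write up front, so the client never believes data was stored that the cluster cannot return.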

Lots of people have lost data with 1 copy, and there are lots of posts on how repl_size=2, min_size=1 lost data for people using ceph, so I think posting advice to that effect goes against what ceph is good for.
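
One way to check whether a cluster already has pools in the risky configuration mentioned above is to scan the pool list for min_size below 2. The following is a rough sketch, not an official tool: it assumes that "ceph osd pool ls detail --format json" returns a JSON list whose entries carry pool_name, size and min_size fields, which should be verified against the output of your own ceph version. The function names are hypothetical.

```python
import json
import subprocess


def fetch_pool_detail() -> str:
    """Return pool details as JSON text (needs the ceph CLI and a reachable cluster)."""
    return subprocess.run(
        ["ceph", "osd", "pool", "ls", "detail", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout


def risky_pools(pool_detail_json: str) -> list:
    """List (name, size, min_size) for pools that accept I/O on a single copy.

    Assumes each entry has "pool_name", "size" and "min_size" keys.
    """
    pools = json.loads(pool_detail_json)
    return [
        (p["pool_name"], p["size"], p["min_size"])
        for p in pools
        if p["min_size"] < 2
    ]


if __name__ == "__main__":
    for name, size, min_size in risky_pools(fetch_pool_detail()):
        print(f"pool {name}: size={size} min_size={min_size}")
```

The parsing is kept separate from the CLI call so that it can be tried on saved output without touching a live cluster.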

Not that I think the original poster would fall into that trap, but others might find this post later and think that it would be a good solution to maximize risk while adding/rebuilding 100s of OSDs. I don't agree.

On Thu, 25 Jul 2019 at 04:36, zhanrzh_xt@xxxxxxxxxxxxxx <zhanrzh_xt@xxxxxxxxxxxxxx> wrote:

I think you should set "osd_pool_default_min_size=1" before you add OSDs,
and the OSDs that you add at one time should be in the same failure domain.

That sounds like weird or even bad advice?
What is the motivation behind it?

-- 
May the most significant bit of your life be positive.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com