Re: too many PGs per OSD (307 > max 300)

zhu tong <besthopeall@xxxxxxxxxxx> · Fri, 29 Jul 2016 03:18:10 +0000

The same problem has been confusing me recently too. I am trying to figure out the relationship (an equation would be best) among the number of pools, OSDs and PGs.

For example, with 10 OSDs and 7 pools in one cluster, and osd_pool_default_pg_num = 128, how many PGs would the health status show?
I have seen some recommended calculations that go the other way round, inferring the osd_pool_default_pg_num
  value from a fixed number of OSDs and PGs, but when I try it the way mentioned above, the two results do not match.

Thanks.

From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Christian Balzer <chibi@xxxxxxx>

Sent: 29 July 2016 2:47:59

To: ceph-users

Subject: Re: too many PGs per OSD (307 > max 300)

On Fri, 29 Jul 2016 09:59:38 +0800 Chengwei Yang wrote:

> Hi list,

> 

> I just followed the placement group guide to set pg_num for the rbd pool.

> 

How many other pools do you have, or is that the only pool?

The numbers mentioned are for all pools, not per pool, something that

isn't abundantly clear from the documentation either.
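A minimal sketch of that "all pools" arithmetic, assuming the per-OSD figure in the warning is the sum over every pool of pg_num times replica size, divided by the number of OSDs and rounded down. The replica size of 3 is an assumption (it is not stated in this thread, but it reproduces the reported 307), the function names are hypothetical, and the 30/300 bounds are the ones quoted from the `ceph -s` output in this thread.

```python
def pgs_per_osd(pools, num_osds):
    """Estimate PG replicas per OSD.

    pools: list of (pg_num, replica_size) tuples, one entry per pool.
    num_osds: number of OSDs the PGs are spread over.
    """
    total_pg_replicas = sum(pg_num * size for pg_num, size in pools)
    return total_pg_replicas // num_osds


def health_hint(pgs, low=30, high=300):
    """Compare the estimate with the [30, 300] range quoted in this thread."""
    if pgs > high:
        return f"too many PGs per OSD ({pgs} > max {high})"
    if pgs < low:
        return f"too few PGs per OSD ({pgs} < min {low})"
    return "within range"


# One pool with pg_num 4096 on 40 OSDs, replica size 3 (assumed).
single_pool = pgs_per_osd([(4096, 3)], num_osds=40)
print(single_pool, health_hint(single_pool))

# Seven pools with pg_num 128 each on 10 OSDs, replica size 3 (assumed).
seven_pools = pgs_per_osd([(128, 3)] * 7, num_osds=10)
print(seven_pools, health_hint(seven_pools))
```

Every pool adds to the same per-OSD total, which is why a pg_num that looks fine for one pool can trip the warning once several pools share the same OSDs.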

>   "

>   Less than 5 OSDs set pg_num to 128

>   Between 5 and 10 OSDs set pg_num to 512

>   Between 10 and 50 OSDs set pg_num to 4096

>   If you have more than 50 OSDs, you need to understand the tradeoffs and how to

>   calculate the pg_num value by yourself

>   For calculating pg_num value by yourself please take help of pgcalc tool

>   "

> 

You should have heeded the hint about pgcalc, which is by far the best

thing to do.

The above numbers are an (imprecise) attempt to give a quick answer to a

complex question.

> Since I have 40 OSDs, I set pg_num to 4096 according to the above

> recommendation.

> 

> However, after configuring pg_num and pgp_num both to 4096, I found that my

> ceph cluster is in **HEALTH_WARN** status, which surprised me and still

> confuses me.

> 

PGcalc would recommend 2048 PGs at most (for a single pool) with 40 OSDs.

I assume the above high number of 4096 stems from the wisdom that with

small clusters more PGs than normally recommended (100 per OSD) can be

helpful. 

It was also probably written before those WARN calculations were added to

Ceph.

The above would better read:

---

Use PGcalc!

[...]

Between 10 and 20 OSDs set pg_num to 1024

Between 20 and 40 OSDs set pg_num to 2048

Over 40 definitely use and understand PGcalc.

---
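For reference, a rough sketch of the kind of calculation PGcalc performs for a pool. It assumes the commonly cited rule: target PGs per OSD times OSD count times the pool's share of data, divided by replica size, then rounded to a power of two, moving up when the lower power of two is more than 25% below the raw value. The function name, the replica size of 3 and the rounding detail are assumptions here, not taken from this thread, and PGcalc's other adjustments are left out.

```python
import math


def suggest_pg_num(num_osds, replica_size, target_pgs_per_osd=100,
                   data_fraction=1.0):
    """Suggest a power-of-two pg_num for one pool (simplified sketch).

    data_fraction is the share of the cluster's data expected in this pool.
    """
    raw = target_pgs_per_osd * num_osds * data_fraction / replica_size
    raw = max(raw, 1.0)  # guard against log2 of values below 1
    lower = 2 ** math.floor(math.log2(raw))
    # Move up if the lower power of two is more than 25% below the raw value.
    if lower < 0.75 * raw:
        return lower * 2
    return lower


# 40 OSDs, single pool, replica size 3 (assumed).
print(suggest_pg_num(40, 3))                          # target of 100 per OSD
print(suggest_pg_num(40, 3, target_pgs_per_osd=200))  # target of 200 per OSD
```

Under these assumptions a single pool on 40 OSDs lands on 1024 with a target of 100 PGs per OSD and on 2048 with a target of 200, both well below 4096.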

>     cluster bf6fa9e4-56db-481e-8585-29f0c8317773

>      health HEALTH_WARN

>             too many PGs per OSD (307 > max 300)

> 

> I see the cluster also says "4096 active+clean" so it's safe, but I do not like

> the HEALTH_WARN in any way.

>

You can ignore it, but yes, it is annoying.

> As far as I know (from ceph -s output), the recommended number of PGs per OSD is [30, 300]; any

> other value outside this range will bring the cluster to HEALTH_WARN.

> 

> So what I would like to say: is the document misleading? Should we fix it?

> 

Definitely.

Christian

-- 

Christian Balzer        Network/Systems Engineer                

chibi@xxxxxxx    Global OnLine Japan/Rakuten Communications

http://www.gol.com/

_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
