Re: too many PGs per OSD (307 > max 300)

Christian Balzer <chibi@xxxxxxx> · Fri, 29 Jul 2016 11:47:59 +0900

On Fri, 29 Jul 2016 09:59:38 +0800 Chengwei Yang wrote:

> Hi list,
> 
> I just followed the placement group guide to set pg_num for the rbd pool.
> 
How many other pools do you have, or is that the only pool?

The numbers mentioned are totals across all pools, not per pool, which
the documentation doesn't make clear either.
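
To make the "all pools" point concrete, here is a rough sketch of how the
per-OSD figure adds up across pools. The pool list and the replica size of 3
are made-up illustration values, not taken from your cluster, and the integer
division is my assumption about how the reported number is rounded:

```python
def pgs_per_osd_all_pools(pools, num_osds):
    """Average number of PG replicas each OSD carries, summed over all pools.

    pools is a list of (pg_num, replica_size) tuples.
    """
    total_pg_replicas = sum(pg_num * size for pg_num, size in pools)
    return total_pg_replicas // num_osds


# Hypothetical layout: three pools on 40 OSDs, replica size 3 assumed for each.
hypothetical_pools = [(2048, 3), (1024, 3), (1024, 3)]
print(pgs_per_osd_all_pools(hypothetical_pools, 40))
```

Three pools like these load each OSD exactly as much as one pool with 4096
PGs, which is why the budget has to be shared between pools.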

>   "
>   Less than 5 OSDs set pg_num to 128
>   Between 5 and 10 OSDs set pg_num to 512
>   Between 10 and 50 OSDs set pg_num to 4096
>   If you have more than 50 OSDs, you need to understand the tradeoffs and how to
>   calculate the pg_num value by yourself
>   For calculating pg_num value by yourself please take help of pgcalc tool
>   "
> 
You should have heeded the hint about pgcalc; using it is by far the best
thing to do.
The above numbers are an (imprecise) attempt to give a quick answer to a
complex question.

> Since I have 40 OSDs, I set pg_num to 4096 according to the above
> recommendation.
> 
> However, after setting both pg_num and pgp_num to 4096, I found my
> ceph cluster in **HEALTH_WARN** status, which surprised me and still
> confuses me.
> 
PGcalc would recommend 2048 PGs at most (for a single pool) with 40 OSDs.

I assume the high number of 4096 above stems from the wisdom that with
small clusters, more PGs than normally recommended (100 per OSD) can be
helpful.
It was also probably written before the PGs-per-OSD WARN checks were added
to Ceph.
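
For what it's worth, the arithmetic behind the warning can be sketched like
this. I'm assuming a single pool with the default replica size of 3 (you
didn't say), and that the figure is rounded down; the 300 limit is the one
from your health output. The function name is just for illustration:

```python
NUM_OSDS = 40       # from the original post
REPLICA_SIZE = 3    # assumption: default replicated pool size
WARN_MAX = 300      # limit shown in the HEALTH_WARN message


def pg_check(pg_num, num_osds=NUM_OSDS, size=REPLICA_SIZE, warn_max=WARN_MAX):
    """Return (PG replicas per OSD, whether that exceeds the warn limit)."""
    per_osd = pg_num * size // num_osds
    return per_osd, per_osd > warn_max


for candidate in (1024, 2048, 4096):
    per_osd, too_many = pg_check(candidate)
    print(candidate, per_osd, "WARN" if too_many else "ok")
```

Under those assumptions 4096 lands on the 307 from your warning, while 2048
and 1024 stay well under the limit.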

The guidance above would read better as:
---
Use PGcalc!
[...]
Between 10 and 20 OSDs set pg_num to 1024
Between 20 and 40 OSDs set pg_num to 2048

Over 40 definitely use and understand PGcalc.
---

>   cluster bf6fa9e4-56db-481e-8585-29f0c8317773
>      health HEALTH_WARN
>             too many PGs per OSD (307 > max 300)
> 
> I see the cluster also says "4096 active+clean" so it's safe, but I do not like
> the HEALTH_WARN in any way.
>
You can ignore it, but yes, it is annoying.

> As far as I know (from ceph -s output), the recommended PG count per OSD is
> [30, 300]; any value outside this range will bring the cluster to HEALTH_WARN.
> 
> So what I would like to say: is the document misleading? Should we fix it?
> 
Definitely.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/